A global comparison of bicycle sharing systems
Journal of Transport Geography

Increasing urban populations have created pressures on the transportation networks that serve them. Bicycle sharing systems (BSS) have seen a dramatic increase in popularity as cities around the world begin to implement and see significant use and benefit from this growing mode of urban micro-mobility. As a result, the research surrounding bicycle sharing systems has also increased, although this has been primarily focused on the analysis of individual systems. There is therefore a need for a global comparison of systems, particularly given that prior research often omits China, which is currently the largest BSS market in the world. This paper therefore marks a major step forward through its analysis of data from 322 schemes situated on all major continents. The analysis suggests that there are 5 main types of BSS: very large, high use BSS; large BSS in major cities; medium BSS with extensive cycling infrastructure; small to medium efficient BSS; and small to medium inefficient BSS. From these major cluster groups, we are able to group schemes by usage, contextual indicators and the behavioural characteristics of their users. This not only facilitates a global comparison of scheme performance, but also offers a basis for new schemes to identify established BSS with similar characteristics that can be used as a template for anticipating the likely demand from users.


Introduction
The number of Bicycle Sharing Systems (BSS) has seen a dramatic increase in recent years, rising from 13 systems in 2004 to over 2000 in 2019 (Bicycle Sharing Blog, 2019), making it one of the fastest growing modes of transportation. The docked BSS we find in many cities, such as Santander Cycles in London and Paris's Vélib' system, are known as "third generation" schemes (Corcoran et al., 2014; Médard de Chardon and Caruso, 2015; Médard de Chardon et al., 2017; DeMaio and Gifford, 2004; ter Beek et al., 2014) and can be characterised as having customised bicycles with automated docking stations secured by user payment card details. They are typically accompanied by websites that provide data on the occupancy rates of each docking station (Médard de Chardon and Caruso, 2015; Davis, 2014).
In the decade since 2010, when the London BSS commenced operation, the global urban population increased from 3.6 billion to 4.3 billion (Worldbank, 2021). With urban growth comes increased levels of congestion (Çolak et al., 2016) that cities need to manage across both public and private transportation modes. As technologies around BSS have improved and become cheaper, they have become an attractive solution to help improve both congestion levels and also health as cycling is a form of active transport that reduces emissions (Fishman et al., 2013;O'Brien et al., 2014;Pucher and Buehler, 2012;Shaheen et al., 2010;Woodcock et al., 2014;Zhang et al., 2019). In a study conducted by Yang et al. (2019), they find evidence to show a change in travel behaviour of BSS users near metro stations, which exemplifies the uptake of BSS as a mode to help overcome issues around the 'last mile problem'. This refers to the difficulty that commuters encounter when travelling from a transportation hub, such as a train or bus station, to their final destination.
User-focused research has shown that weekday journeys during peak hours may actually be faster using BSS services compared to taxi journeys (Faghih-Imani et al., 2017). Similarly, with a careful integration with existing public transportation networks, BSS has been found to reduce overall travel times as well as increase use of these more environmentally friendly modes of public transportation as a whole (Jäppinen et al., 2013;Zhang et al., 2015). In addition, the benefits of cycling can outweigh disbenefits of air pollution exposure (Mueller et al., 2015) as well as provide a safer means to travel, in comparison to private cyclists (Fishman and Schepers, 2016).
Whilst the literature on individual BSS is extensive there have been few comparisons of schemes globally. This can be attributed to the lack of routine data releases from BSS operators (Mátrai and Tóth, 2016) and has meant there is currently no established standard for the comparison of BSS (Médard de Chardon and Caruso, 2015). One important consequence of this is that it allows operators to overstate the success of their BSS in comparison to others in search of benefits such as increased investment (Médard de Chardon and Caruso, 2015). This paper therefore provides a timely update and extension of the global analysis conducted in O'Brien et al. (2014) and delivers a comprehensive comparison of an unprecedented number of docked BSS around the world.
Utilising a very large collection of BSS data, this research conducts a two-staged clustering analysis to help gain insights into the global landscape of docked BSS, their evolution, and individual system successes and failures. To achieve this, it was first necessary to establish comparative metrics across all systems in the dataset through the manipulation of dock capacity data. These metrics (detailed in Section 4.1) provide the foundations for the global classification created through a combination of K-means and Dynamic Time Warping (DTW) clustering.

Literature review
The body of BSS literature has been comprehensively detailed in Fishman (2016) and Fishman (2019), so our focus in this literature review is on those studies that offer comparisons between BSS (Médard de Chardon et al., 2017;O'Brien et al., 2014;Kou and Cai, 2019;Sarkar et al., 2015;Zaltz Austwick et al., 2013;Zhang et al., 2015), as well as variables that influence usage across systems. The variables that are found to be influential (detailed in Section 2.1) have helped to inform decisions on the counterfactual variables to underpin the classification process detailed in Section 4.2.

Variables of influence
When assessing the factors that can influence the use of BSS, it makes sense to start with the attributes of BSS themselves. As Table 1 shows, the size of a system appears to play a large role in influencing users' propensity to use it, as well as the absolute number of riders. We also find that the larger the system and the higher the density of docking stations, the more a BSS is used (Médard de Chardon et al., 2017; O'Brien et al., 2014). There is also a growing body of literature surrounding users' propensity to use systems based on factors such as their distance from the nearest docking station. Such insights are usually garnered through survey methods and can shed light on how far users are willing to travel in order to use BSS (Guerra et al., 2012; Bachand-Marleau et al., 2012). For example, Bachand-Marleau et al. (2012) found that the maximum distance a user is willing to walk to the nearest BSS station is around 500 m. Similarly, Gu et al. (2019) used 500 m as the 'acceptable walking distance' for commuters using BSS in combination with public transportation. Therefore, 500 m has been adopted as the standardised distance for the circle of operation around each docking station, which has been used to calculate the operational area for each BSS in this paper. A fixed distance is used in order to maintain metric homogeneity across BSS.
The characteristics of the population in close proximity to BSS are another important determinant of their use. Fishman (2016) shows that higher population density and higher income areas have more active BSS use (Faghih-Imani et al., 2014; Fishman et al., 2014; Woodcock et al., 2014). Research by Bachand-Marleau et al. (2012) and Buck et al. (2013) has found that users' income levels have a mixed level of influence both within BSS as well as between BSS, with evidence showing that in Montreal, Canada, those users earning less than Can $40,000 were 32% less likely to use BSS, whereas those users in Washington D.C., United States, were typically of lower income in comparison to private cyclists. BSS is more likely to have a closer gender balance, although still with a male bias (Pucher et al., 2011). Research on the ethnicity and education of BSS users in western cities shows higher use rates in urban areas with a higher proportion of white people as well as more educated populations (Fishman et al., 2014; Martin and Shaheen, 2014; Buck et al., 2013). The schemes have often favoured more affluent areas in the city, but Goodman and Cheshire (2014) found that residents from more deprived areas use BSS if these are built in their local areas and remain affordable relative to other modes.
Weather and topography have been found to play a significant role in the use patterns within BSS and seasonal impacts are clear from the number of trips (Ahmed et al., 2010), with areas/times with high levels of average rainfall seeing a decrease in the use of BSS (Campbell et al., 2016; Corcoran et al., 2014; Miranda-Moreno and Nosal, 2011; Nankervis, 1999). Temperature also appears to play a significant role in the use of BSS.
Table 1. A summary of some of the variables which influence the use of BSS, extracted from the papers discussed.
With safety concerns found to be one of the largest barriers to BSS use (see Fishman et al., 2014), those urban areas with more dedicated cycling infrastructure benefit from greater BSS use (Faghih-Imani et al., 2014; Mateo-Babiano et al., 2016). However, research also suggests that regulations requiring users to wear helmets can limit the appeal of BSS (Médard de Chardon et al., 2017; Fishman et al., 2013; O'Brien et al., 2014), with BSS users significantly less likely to wear helmets than private bicycle users in those cities which do not mandate cyclists to wear helmets (Fischer et al., 2012; LDA Consulting, 2013).
There are numerous other variables which have been assessed at the individual BSS level, but temporality is the one found to have the most consistent impacts across BSS. The number of journeys within a BSS conforms to seasonal, weekly and daily trends (Ahmed et al., 2010; Faghih-Imani et al., 2017; Faghih-Imani et al., 2014; Zaltz Austwick et al., 2013). In particular, we can see clear differences in the use patterns within a system when comparing weekday and weekend journey patterns. As a result, the weekday-weekend divide has been a consistent feature of the majority of analyses which consider use patterns across the BSS literature (Faghih-Imani et al., 2017; Faghih-Imani et al., 2014; Mateo-Babiano et al., 2016).

Bicycle sharing system comparisons
Comparisons between BSS provide a way of assessing the relationships between different attributes across systems. This enables us to develop a deeper understanding of the BSS landscape as a whole, as it allows us to see how the variables impact BSS performance globally. Previous work in this respect has been limited by poor data availability, therefore few regional comparisons have been attempted. There are exceptions, which include Zaltz Austwick et al. (2013) who compare 5 systems, and a number of studies that seek to investigate differences in usage patterns as well as clustering (see also Bieliński et al., 2019;Kou and Cai, 2019;Sarkar et al., 2015;Zhang et al., 2015).
The scope for substantive international comparisons across BSS is constrained by the inconsistency of BSS data collection practices and formats, and the limited availability of data for many systems. As a result, when making comparisons between BSS the "trips per day per bicycle" metric (TDB) is commonly used as it is straightforward to calculate and enables some understanding of the usage rates in a BSS. This metric is calculated using the dock capacity information, which is the most commonly available form of real-time BSS data. For example, it is the basis to Médard de Chardon et al.'s (2017) comparison of 75 systems from around the world that determined common variables that influence the use of each BSS. Similar to Table 1, the authors identified 5 categories of variable: BSS attributes, density and compactness, geography, weather and transportation infrastructure. Although the authors express TDB as a 'good standardised measure of success' (Médard de Chardon et al., 2017), in this paper TDB has been used primarily as a measure of system efficiency and rate of use.
There is a need for a large global comparison of systems, particularly given that prior research often omits China, which is currently the largest BSS market in the world with 1.9 million bicycles in over 430 cities in 2016 (Gu et al., 2019). The research in this paper therefore marks a major step forward given it benefits from data collected from schemes situated on all major continents. It can thus reveal new similarities/differences between systems and offer insights at a truly global level.

Bicycle sharing system data
Dock capacity information is the most abundantly available data for BSS as it offers a real-time indication of the number of bicycles and spaces at each docking station within a particular BSS. These data are typically used to notify users and system operators of bike availability across the system. But for the purposes of this research, these data were collected and stored within a database at 2-min intervals (for the majority of systems), enabling us to build up a detailed understanding of the size of the fleet within each system as well as the flow of bicycles in and out of docking stations. A subset of BSS (see Appendix) also release Origin-Destination (O-D) journey data, which is used within this research for calibrating and validating journeys detected from the manipulation of dock capacity data (Section 4.1.1). Fig. 1 depicts the distribution and size of the 322 BSS for which we have complete data and used within the final analysis. The dataset provides the most holistic view of BSS which is currently available, with BSS data from all continents. The coordinates of docking stations within each BSS were also recorded and stored within a database. These were used to estimate the operational extent and to extract the confounding variables for each scheme (Sections 4.1.3 and 4.1.4).

Confounding variable data
In an effort to account for different contextual settings of each BSS additional confounding variables were calculated and clustered. These were chosen based on the literature in Section 2.1 and are detailed in Table 2. We have attempted to choose from each variable category, prioritising those which exhibited the most consistent impacts across BSS. Unfortunately, the choice of variables was limited to those with global coverage, as it was necessary to maintain metric homogeneity across the BSS within this analysis. After careful consideration, the metrics of population, topography, precipitation and cycling infrastructure were selected and used to calculate those additional confounding variables (detailed in Section 4.1.4).
Data on cycling infrastructure were derived from OpenStreetMap (OSM) data. OSM is widely used and has been found to be a more up-to-date and granular source of such data in comparison to proprietary alternatives like Google Maps (Natera et al., 2020).

Metric creation
The careful manipulation of dock capacity data enables the creation of metrics indicating the size and use of each BSS, whilst dock location data can be used to determine the operational area of each BSS as well as enable the calculation of confounding variables within these areas. The metrics calculated provide the foundations for the creation of two datasets (Section 4.2) that split the time constant and time series variables and are used to conduct the two-staged classification process.

Journey estimation, validation and cleaning
In order to estimate the number of journeys within each system, the dock capacity records were examined sequentially. A fall in the number of bicycles between two consecutive collection periods (usually 2-min) was used as a proxy for the start of a journey from that docking station. This heuristic was applied to each docking station within a BSS, helping to build up a detailed picture of the frequency of journeys. Similar journey estimation methods have been used by Médard de Chardon and Caruso (2015) who found that aggregating journey estimations to the daily level proved to be the most accurate. Journeys were also aggregated to the hourly level, in order to be able to examine patterns of use throughout a day in more detail.
The validation of journey estimations was an essential step due to the importance of the variable in the creation of other metrics such as TDB as well as in the second-stage DTW cluster analysis of the use profile for a typical day. For these reasons, many precautions were taken in order to ensure its validity.
For the systems for which we have journey data available (see Appendix), it was possible to compare the calculated journey estimations to the journey data. This contains the O-D flows of each journey taken within a BSS, as opposed to dock capacity data that is likely to contain operator redistributions as well as data errors. Therefore, the data is aggregated to the same 24-h period as the estimated journey figures and used to calculate the percentage difference. After initial investigations, journey estimations were found to both under and overestimate journey counts. On average, the estimated journey figures varied by up to 15% with respect to actual journey records from the journey data. Underestimations can arise from the frequency of observations not being sufficient to capture all dock changes. This effect can be reduced by using a shorter collection interval (Médard de Chardon and Caruso, 2015).
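The estimation heuristic and the validation step can be illustrated with a minimal sketch. It assumes that the dock capacity snapshots for one station are held as a chronological list of bike counts, and that a fall of n bikes between two snapshots counts as n journey starts; the function names and sample values are hypothetical and are not taken from the processing pipeline used in this research.

```python
def estimate_departures(bike_counts):
    """Estimate journey starts at one docking station.

    bike_counts: chronological bike totals, one per collection period
    (for example every 2 minutes). Assumption: a fall of n bikes between
    two consecutive snapshots is counted as n journey starts.
    """
    departures = 0
    for previous, current in zip(bike_counts, bike_counts[1:]):
        if current < previous:
            departures += previous - current
    return departures


def percentage_difference(estimated, recorded):
    """Signed difference of the estimate relative to recorded O-D journeys (%)."""
    return 100.0 * (estimated - recorded) / recorded


# Hypothetical snapshots for a single station over seven collection periods.
snapshots = [10, 9, 9, 7, 8, 8, 6]
daily_estimate = estimate_departures(snapshots)
```

Rises in the bike count are ignored here because they correspond to journey ends rather than starts. Summing the station-level estimates over a 24-h period gives a daily figure that can be compared with recorded journeys using `percentage_difference`.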
Overestimations are typically due to operator actions such as rebalancing or bicycle removal for repairs, which are likely to get captured within dock capacity data. There were also periods of overestimations triggered largely by transmission errors from docking stations. These were identified within the dock capacity data where sequential records fluctuated between the same value across several collection periods (as seen in Fig. 2). For this method to be effective at identifying all periods of potential data collection errors across all BSS, it was necessary to test several threshold levels. Due to the high data volume and the large variability in BSS size, it was found that low threshold levels were necessary to capture all true positive cases of false data collection. This meant that differences in the number of bikes of more than 1 over a period greater than 6 sequential data collection instances were identified as periods of potential data error. Since the threshold values were very low, there were instances where the algorithm captures some false positives. As a result, it was necessary to manually confirm each case identified as potential false data and remove those periods which were verified to be a false positive.
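One possible reading of this detection rule is sketched below. It is an interpretation rather than the algorithm used in this research: a run is flagged when the bike count keeps returning to the value seen two snapshots earlier while jumping by more than 1 bike at each step, for more than 6 consecutive snapshots. The function name and both parameters are hypothetical.

```python
def flag_oscillations(bike_counts, min_jump=2, min_run=7):
    """Return (start, end) index pairs of suspected transmission errors.

    Assumed rule: a snapshot belongs to a suspect run when it equals the
    value two snapshots earlier and differs from the previous snapshot by
    at least `min_jump` bikes. Runs shorter than `min_run` are ignored.
    """
    flagged = []
    run_start = None
    n = len(bike_counts)
    for i in range(2, n):
        alternating = (
            bike_counts[i] == bike_counts[i - 2]
            and abs(bike_counts[i] - bike_counts[i - 1]) >= min_jump
        )
        if alternating:
            if run_start is None:
                run_start = i - 2  # the run begins two snapshots earlier
        else:
            if run_start is not None and (i - 1) - run_start + 1 >= min_run:
                flagged.append((run_start, i - 1))
            run_start = None
    # Close a run that is still open at the end of the series.
    if run_start is not None and (n - 1) - run_start + 1 >= min_run:
        flagged.append((run_start, n - 1))
    return flagged
```

Because the thresholds are deliberately low, a rule of this kind will also catch genuine activity at busy stations, which is why each flagged period would still need manual confirmation.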

Maximum available bicycles and trips per day per bicycle
The number of operational bicycles (total number of bikes in a single collection period) was recorded and TDB was calculated by dividing the estimated number of journeys within a particular day by the maximum number of operational bicycles for the same day. TDB helps to account for differences in the size of the system and allows for cross comparison between systems in terms of each bicycle's use per day.
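As a worked illustration of this calculation, the sketch below divides a daily journey estimate by the largest system-wide bike total observed on that day. The function name, the zero-fleet guard and the sample figures are assumptions made for the example.

```python
def trips_per_day_per_bicycle(estimated_journeys, snapshot_bike_totals):
    """Compute TDB for one day.

    estimated_journeys: estimated number of journeys on that day.
    snapshot_bike_totals: system-wide bike totals, one per collection
    period on the same day; the maximum is taken as the operational fleet.
    """
    max_bikes = max(snapshot_bike_totals)
    if max_bikes == 0:
        return 0.0  # assumption: a system with no bikes has a TDB of zero
    return estimated_journeys / max_bikes


# Hypothetical day: 1200 estimated journeys and at most 400 bikes observed.
tdb = trips_per_day_per_bicycle(1200, [380, 400, 395])
```

In this hypothetical case each bicycle is used three times per day on average.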

Operational area
As mentioned in Section 2.1, the operational area of a BSS was defined by placing a 500 m radius buffer around each of the docking stations. These were dissolved into a single polygon and were used to calculate the average operational area. As with all the other metrics calculated within this analysis, it was important to ensure homogeneity in the way that the metrics were calculated, not only to ensure comparative rigor, but also to help automate the calculation process across a large number of BSS. We therefore opted for this simpler rule of thumb, rather than attempting to determine operational areas (which are also subject to change) on an operator by operator basis. This approach has been taken in other studies. Using a 300 m radius buffer to determine operational area makes no difference to the K-means cluster assignments in Section 5.1, whilst a 250 m radius buffer causes a change in the classification of just 1 system out of the 322 BSS within this analysis. This is a very minor change, reinforcing the strong cluster identities found within the results (Section 5).
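The buffer-and-dissolve step can be approximated without GIS software, as in the sketch below. It is not the procedure used in this research: it assumes that station coordinates are already projected to metres and estimates the area of the merged buffers by counting grid cells whose centre lies within the chosen radius of any station. The function name and the cell size are hypothetical.

```python
import math


def operational_area_km2(stations, radius_m=500.0, cell_m=25.0):
    """Approximate the area of the union of circular buffers (in km^2).

    stations: (x, y) positions in metres in a projected coordinate system.
    Overlapping buffers are counted once, mimicking a dissolve operation.
    """
    if not stations:
        return 0.0
    min_x = min(x for x, _ in stations) - radius_m
    max_x = max(x for x, _ in stations) + radius_m
    min_y = min(y for _, y in stations) - radius_m
    max_y = max(y for _, y in stations) + radius_m
    cols = math.ceil((max_x - min_x) / cell_m)
    rows = math.ceil((max_y - min_y) / cell_m)
    radius_sq = radius_m ** 2
    covered = 0
    for row in range(rows):
        cy = min_y + (row + 0.5) * cell_m  # cell centre, y
        for col in range(cols):
            cx = min_x + (col + 0.5) * cell_m  # cell centre, x
            if any((cx - sx) ** 2 + (cy - sy) ** 2 <= radius_sq
                   for sx, sy in stations):
                covered += 1
    return covered * cell_m ** 2 / 1e6
```

Re-running the function with `radius_m=300.0` or `radius_m=250.0` gives the kind of alternative area estimates used in the sensitivity check described above. For large systems a polygon-based union would be faster and more precise than this grid approximation.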

Confounding variables
Time invariant confounding variables (population, precipitation, topography and cycle paths) were estimated for each BSS to add additional contextual information to our clusters. Active docking stations, in the period between April and September 2018, were buffered to generate polygons for overlay onto the confounding variable datasets (Section 3.2). The buffer distances used for each variable are shown in Table 2. For both the population and precipitation variables we decided to use a 1 km buffer to incorporate some of the urban characteristics beyond the immediate limits of the outer docking stations in a scheme. Although literature suggests that willingness to walk to a docking station is around 500 m for some systems (Gu et al., 2019;Bachand-Marleau et al., 2012), we know that users of BSS are likely to use BSS as a mode to solve issues around the "last mile problem" (Fan et al., 2019;Yang et al., 2019), which implies users travelling from outside that 500 m walking boundary. Similarly, when considering the topography and length of cycling infrastructure of a docked BSS, it is important to consider that journeys will be taken between docking stations, therefore calculations for these variables were limited to 300 m buffers and a convex hull. Much like the determination of the use of 500 m radius buffers for calculating operational area, the buffer sizes/methods were chosen to best capture the nature of each confounding variable in the context of each BSS in a homogenous manner.

Panel style data curation and BSS clustering
Following the creation of the metrics detailed across Section 4.1, the data was split into time constant and time series data to allow for a two-staged clustering analysis to be conducted. This method was chosen to take advantage of the multiple metric types that were created as well as current limitations with the simultaneous clustering of time series and time constant variables.
For the first stage of clustering, a time constant dataset was created for analysis using a K-means technique (MacQueen, 1967). K-means clustering is a standard method for exploratory clustering where the number of clusters is specified by the user (Wagstaff et al., 2001). Here, it is employed to gain a detailed, static understanding of the global BSS landscape. The variables included within this time constant dataset (see Appendix) were a combination of BSS variables (Sections 4.1.1 to 4.1.3), as well as the confounding variables (Section 4.1.4). Since these metrics were calculated for each day that we have operational data, it was necessary to aggregate metrics to the same time period for comparative purposes. As a result, the metrics were aggregated for the 6-month period between April and September 2018, since this was the period for which the largest group of BSS had operational data. Each metric was aggregated over this period, and BSS variables were split between weekday and weekend measures due to the large amount of variability found between patterns of use (Faghih-Imani et al., 2014; Faghih-Imani et al., 2017; O'Brien et al., 2014). In addition to the time aggregation, variables were normalised in order to remove scale and weighting issues in the K-means clustering. Conducting this first stage of clustering analysis allows us to gain an understanding of the type of system, in terms of its size, environment and general use, as well as where each of these systems is situated in comparison to other BSS around the world.
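The first-stage procedure can be sketched as normalisation followed by K-means. The example below is a simplified illustration under stated assumptions: min-max scaling is assumed because the exact normalisation is not specified here, the clustering is a plain Lloyd's algorithm with a seeded random start, and the two variables and six systems are hypothetical.

```python
import random


def min_max_normalise(rows):
    """Rescale every column to [0, 1] so that no variable dominates."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centres)."""
    rng = random.Random(seed)
    centres = [list(point) for point in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assign every point to its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centres[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres


# Hypothetical systems described by (fleet size, weekday TDB).
systems = [[100, 0.5], [120, 0.6], [90, 0.4],
           [5000, 4.0], [5200, 4.2], [4800, 3.9]]
labels, centres = k_means(min_max_normalise(systems), k=2)
```

Without the normalisation step, fleet size would dominate the distance calculation simply because it is measured on a much larger scale than TDB.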
In the second stage of clustering, the hourly journey estimations were used to create a time series dataset on each BSS. Similar to the creation of the time constant dataset for the first stage of clustering, the hourly journey estimations (Section 4.1.1) were aggregated for the same 6-month period between April and September 2018 and split between weekday and weekend (Faghih-Imani et al., 2014; Faghih-Imani et al., 2017; O'Brien et al., 2014). The time series data was also normalised, again to counter issues around scales. By aggregating the data in this way, the resulting dataset depicts the average pattern of use within each BSS for a typical weekday or weekend. For each of the clusters created in the first stage of clustering, a DTW cluster analysis was conducted. DTW is a shape-based clustering method for time series data, computing the smallest Euclidean distances between each time series based on a moving window (Berndt and Clifford, 1994). These distances were then clustered using a hierarchical clustering method based on Ward's method (Mojena, 1977). This was chosen after assessing the 3 alternatives, namely the single, complete and average methods (Edelbrock, 1979). It was found that Ward's method provided the cleanest separation of clusters whilst minimising the within-cluster sum of squares variance. This second stage of clustering was employed to help give additional detail into each first stage cluster by investigating how systems within the first stage cluster are further grouped, based on their patterns of use.
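The distance calculation at the heart of this stage can be illustrated with a textbook dynamic-programming formulation of DTW. This is a generic sketch rather than the implementation used in this research: the absolute difference is assumed as the pointwise cost, the optional `window` parameter is a hypothetical stand-in for the moving window, and the two short profiles are invented.

```python
def dtw_distance(a, b, window=None):
    """Dynamic Time Warping distance between two numeric sequences.

    window: maximum allowed index offset between matched points;
    None places no constraint on the warping path.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    window = max(window, abs(n - m))  # the path must be able to reach the end
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Two hypothetical use profiles with the same shape, one shifted by a step.
early_peak = [0, 1, 2, 1, 0, 0]
late_peak = [0, 0, 1, 2, 1, 0]
shape_distance = dtw_distance(early_peak, late_peak)
```

The two profiles differ at four of their six positions when compared point by point, yet their DTW distance is zero because warping aligns the peaks. This is what makes the measure shape-based. A matrix of such pairwise distances is the input that a hierarchical clustering routine would then group.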

Bicycle sharing system clustering
The first stage of the clustering was designed to create an understanding of the general groups that exist in the current BSS landscape. Although direct and statistical methods of deriving the number of clusters (namely the elbow method, the Calinski criterion and the average silhouette method) largely suggested 2 to be optimal, the decision was taken to group BSS into 5 clusters using a K-means clustering algorithm. Dividing 322 BSS into 2 unequal groups would not be beneficial in helping to uncover a deeper understanding of the global BSS landscape, whilst 5 clusters allowed for a large enough differentiation between clusters with clear cluster identities and kept the number of clusters at a minimum. The centres of each of these clusters are detailed in Table 3. The clusters have been named, based on these characteristics, as: 'very large, high use BSS', 'large BSS in major cities', 'medium BSS with extensive cycling infrastructure', 'small to medium efficient BSS', and 'small to medium inefficient BSS'. Within this section, we refer to the 'efficiency' of a BSS using the TDB metric. Those systems that demonstrate higher TDB figures are taken to be more efficient, since this means that each bike is used on multiple occasions throughout an average day.
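Of the three diagnostics mentioned above, the average silhouette is the simplest to illustrate. The sketch below computes it for a given labelling using Euclidean distance; it is a generic formulation with invented one-dimensional data, not a reproduction of the diagnostics run for this analysis, and it assumes at least two clusters are present.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette width of a labelling (requires >= 2 clusters)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-member clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical one-dimensional data with two obvious groups.
data = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = mean_silhouette(data, [0, 0, 1, 1])
poor = mean_silhouette(data, [0, 1, 0, 1])
```

Values close to 1 indicate compact, well separated clusters, so the candidate number of clusters with the highest average silhouette is the one this method would favour. As the discussion above notes, such a statistical optimum still has to be weighed against how interpretable the resulting groups are.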

Very large, high use BSS
The first cluster, 'very large, high use BSS', contains the largest and most used BSS in terms of the absolute number of journeys. This cluster only contains three systems, all of which are located in East Asia: Taipei, Suzhou and Weifang. These systems do not appear to have extensive road or cycle infrastructures. Research suggests that there is a positive correlation between the level of cycling infrastructure and the number of journeys, so there may be further unmet demand in these cities.
Assessing the results of the sub-clusters we can see that all of the systems have a very similar daily use pattern (Table 4). During an average weekday all of the systems have two major peaks at the start and end of the working day (Fig. 4), which informs us that these systems are typically used by commuters. Suzhou and Weifang appear to exhibit similar use patterns throughout the weekend, although the Taipei City BSS demonstrates a single, large peak throughout the weekend (Fig. 5). This suggests a dominance of leisure and tourism activity on non-working days.

Large BSS in major cities
The second cluster, 'large BSS in major cities', contains 15 systems that include New York, Paris, London, Barcelona, and Seoul (Table 5). These BSS are typically located in sprawling conurbations with large and densely populated cores. These systems hold similar characteristics to those very large, high use BSS, the main difference being their absolute size. Large BSS in major cities appear to have a greater proportion of roads with cycling infrastructure compared to very large, high use BSS. The success of BSS within this cluster may help to reinforce previous literature which finds that greater investment in cycle safety helps encourage greater BSS use (Faghih-Imani et al., 2014;Fishman et al., 2014;Mateo-Babiano et al., 2016).
Assessing the sub-clusters, we can see that there are two distinct patterns of use that we can identify during the average weekday (Fig. 6) and weekend (Fig. 7) within these large systems in major cities. During the weekdays we find that the majority of the systems have the typical two commuter peaks. We also find that five of the systems appear to have a more gradual use pattern, with less defined commuter peaks, although they are still identifiable. During the weekend, we find that the majority of systems appear to have one large peak, which is likely driven by leisure and tourism use. There are four systems that appear to have two commuter peaks at the weekend, displaying similar patterns to the weekday. These systems are all in Chinese cities. This gives us an indication that the users within these systems continue to work throughout the weekends.

Medium BSS with extensive cycling infrastructure
The third cluster, medium BSS with extensive cycling infrastructure, is a small cluster of five systems (Table 6), which are characterised by an unusually large road and cycling infrastructure network for their size. Although these systems benefit from good levels of cycling infrastructure, they appear to have relatively low levels of use. The cities with these systems are also relatively flat and dry. Both these aspects contradict the earlier literature (Campbell et al., 2016; Corcoran et al., 2014; Midgley, 2011; Miranda-Moreno and Nosal, 2011; Nankervis, 1999), which goes to show that the confounding variables within this analysis do not have consistent implications for the activity and use within BSS. Further investigation into these systems would provide new understandings of how these variables can impact the use of BSS in different ways.
Looking into the aggregated daily travel patterns shows that for the majority of systems, there appears to be a clear two-commuter peak trend during the weekdays (Fig. 8) and a large extended peak during the weekends (Fig. 9). The BSS in Stuttgart (Germany) appears to be an anomaly in terms of its typical use on both the weekday and weekend. This is primarily due to its small size in comparison to the other systems in this cluster, as well as the fact that this system used electric bicycles. Electric BSS were much less common at this time, which may help to explain its differing journey patterns. At the end of October 2018, the operator lost its contract in a competitive retendering process and the system was replaced by that of another operator, for whom we do not have data.
Fig. 9. Results of DTW clustering on average weekend use for the medium BSS with extensive cycling infrastructure cluster.

Small to medium efficient BSS
The fourth cluster, small to medium efficient BSS, is a large group of 66 systems (Table 7). Although much smaller, these systems appear to have a similar density of cycling infrastructure to the very large, high use BSS, which follows the trends suggested in the current literature. For their size, these systems achieve high numbers both in terms of the number of journeys and TDB, during both the weekends and weekdays. This suggests that these systems are well managed, and ideally set up to cater to the surrounding population.
An investigation into the sub-clusters shows that there are three main types of journey pattern during the weekday and weekend. During the weekday (Fig. 10), the first two sub-clusters depict the typical two commuter peak pattern, with the first sub-cluster depicting two defined peaks whilst the second sub-cluster has much smaller peaks. The third sub-cluster of journey patterns during the weekday depicts two very small peaks during commuting hours, although there appears to be a gradual increase in journeys throughout the day until around 6 or 7 pm when the number of journeys starts to fall. These systems therefore appear to exhibit a more varied user base, with general trends similar to those of leisure and tourism use, with small surges in demand during peak hours by commuters. During the weekends (Fig. 11), again we see that two of the clusters follow the typical trend of having one large peak across the weekend. Both of these sub-clusters show a steep incline throughout the morning hours between 8 am and noon, with a steep decline in use in the late evening around 9 pm. The third sub-cluster appears to have a much more even distribution of journeys throughout the day, with journeys appearing to gradually increase throughout the morning hours accompanied by a gradual decrease in the number of journeys into the evening and night.

Small to medium inefficient BSS
The final cluster, small to medium inefficient BSS, contains the largest number of systems (233) (Tables 8 and 9), but these are, on average, the smallest BSS within the dataset and experience the lowest levels of use. These systems appear to be inefficient, with bicycles used less than once a day. Although they have the lowest average measures for most metrics, they have comparatively similar levels of average precipitation and total relief to the medium BSS with extensive cycling infrastructure and the small to medium efficient BSS.
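The efficiency threshold referred to here can be made concrete with a short sketch. It assumes that TDB denotes trips per day per bicycle (the abbreviation is not expanded in this section) and uses a hypothetical function name and made-up figures.

```python
def trips_per_day_per_bicycle(total_trips, n_days, n_bicycles):
    """Average number of trips made by each bicycle per day.

    Assumes TDB = total trips / (days observed * bicycles in the fleet).
    """
    if n_days <= 0 or n_bicycles <= 0:
        raise ValueError("n_days and n_bicycles must be positive")
    return total_trips / (n_days * n_bicycles)


# Hypothetical system: 4500 trips over 30 days with a fleet of 200 bicycles.
tdb = trips_per_day_per_bicycle(4500, 30, 200)
print(tdb, "inefficient" if tdb < 1 else "at least one trip per bicycle per day")
```

Under this definition, a value below 1 corresponds to bicycles being used less than once a day on average.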
Assessing the travel patterns of these systems, the journey patterns are arranged into 4 sub-clusters for both the weekdays and weekends. For the weekday travel patterns (Fig. 12), 2 sub-clusters have the typical two-commuter peak, one containing large and distinct peaks (sub-cluster 3) and the other characterised by smaller peaks during commuting hours with more even use throughout the day (sub-cluster 4). The first sub-cluster has fairly uniform use throughout the day, which may suggest systems characterised by a mix of leisure and commuter users, although large peaks and variations may help to identify systems with poor use or poor data quality. Similarly, the second sub-cluster has no prominent pattern, owing to infrequent use of the systems and the resulting potential for messy data. The patterns we find for weekend use (Fig. 13) are similar, with the second and third sub-clusters showing typical weekend behaviour: one large peak, with a steady increase throughout the morning and steady use throughout the afternoon and early evening. The first sub-cluster depicts more uniform BSS use throughout a typical weekend, although we find large spikes during the early hours of the morning at 3 am, which may be a result of the 24 h operation of BSS in comparison to alternative public transportation modes, or of operational maintenance such as bicycle rebalancing. The final weekend sub-cluster contains those systems with the lowest number of journeys, which therefore show sporadic use throughout the day. This occurs because these systems are rarely used, so when a few journeys do occur they cause a large spike in the number of journeys.

Fig. 13. Results of DTW clustering on average weekend use for the small to medium inefficient BSS cluster.

Applications
The analysis here offers new insights and perspectives on the global BSS landscape, allowing for a greater understanding of the structure and relationships between BSS at the global level. A prominent issue with the analysis of BSS is the availability and quality of the data (Médard de Chardon and Caruso, 2015;Mátrai and Tóth, 2016). This has allowed for unchallenged exaggerations in terms of the published statistics from BSS operators (Médard de Chardon and Caruso, 2015). The data and heuristics adopted here could therefore be used to verify the published statistics for those systems whose operators release highlight statistics, as well as helping to predict the use for those systems that do not release data at all. Since the metrics calculated within this research are independent of any operator motivations, they may be considered impartial. They also provide insights into systems that fail to share data. In addition, this research offers a tool to help cities better plan BSS, whether they wish to implement new systems or improve existing ones. By comparing the characteristics of their city to those within the analysis, city officials can learn from the more efficient systems. This is timely given the fast-expanding nature of this mode of micro-mobility. Alternative modes of micro-mobility, such as scooters and dockless BSS, have a similar data format and are similarly limited in terms of data availability. The methods outlined in this research can easily be extended to other new modes of urban micro-mobility and help to provide a framework for comparison between system metrics.

Table 9. K-means Cluster 5 DTW weekend sub-cluster allocation.
Finally, by using the metric creation methodology outlined in this paper, it is possible to gain a detailed understanding of the operation of a BSS using dock capacity data alone. This means that it is straightforward to scale the analysis by running the heuristics on data inputs from additional schemes as their data become available.
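To indicate how journeys might be inferred from dock capacity data alone, the following is a minimal sketch that counts net changes in the number of available bicycles at a single docking station between successive snapshots. It is an assumed simplification for illustration, not the heuristics used in this research, and the function name and example values are hypothetical.

```python
def estimate_station_activity(bike_counts):
    """Estimate departures and arrivals at one docking station.

    bike_counts: number of available bicycles at successive snapshots.
    A fall between snapshots is counted as departures and a rise as
    arrivals. A pick-up and a return within the same interval cancel
    out, so the result is a lower bound on true activity, and operator
    rebalancing is indistinguishable from user journeys.
    """
    departures = 0
    arrivals = 0
    for previous, current in zip(bike_counts, bike_counts[1:]):
        delta = current - previous
        if delta < 0:
            departures += -delta
        else:
            arrivals += delta
    return departures, arrivals


# Hypothetical snapshots of available bicycles at one station.
snapshots = [10, 8, 9, 9, 5]
print(estimate_station_activity(snapshots))
```

Summing the estimated departures across all stations in a scheme would then give an approximate journey count per time interval, from which daily profiles and usage metrics could be built.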

Summary
The analysis detailed here offers new insights into the global landscape of BSS, outlining a robust method that manipulates a voluminous dataset in a manageable way to create a comprehensive global comparison of BSS.
Dock capacity data provide the foundations for the majority of the comparative metrics created within this research, and have been supplemented with additional confounding variables to enable further contextual understanding of each BSS. Using a two-stage clustering process on the calculated metrics provides a framework for comparison between BSS and helps to situate individual systems within this rapidly growing mode of urban micro-mobility.
Conducting such analysis, there appear to be 5 main types of BSS: very large, high use BSS, large BSS in major cities, medium BSS with extensive cycling infrastructure, small to medium efficient BSS and small to medium inefficient BSS. From these major cluster groups, we are able to group schemes by usage, contextual indicators and the behavioural characteristics of their users. This not only facilitates a global comparison of scheme performance, but also offers a basis to new schemes to identify established BSS with similar characteristics that can be used as a template for anticipating the likely demand from users.

Declaration of conflicting interests
The authors declared no conflict of interests with respect to the authorship and/or publication of this original research article.