Abstract
Congestion in urban rail transit (URT) systems often results in passengers being left behind on platforms because trains reach capacity. Distinguishing between the travel choice behaviors of passengers who board the first arriving train (Type I passengers) and those who are left behind (Type II passengers) in passenger assignment is essential for effective URT passenger management. This paper proposes a data-driven passenger-to-train assignment model (DPTAM) that leverages automated fare collection (AFC) data and automated vehicle location (AVL) data to differentiate between the travel choice behaviors of the two types of passengers. The model comprises two modules based on passenger travel choice behavior: the passenger route choice model (PRCM) and the passenger itinerary choice model (PICM). The PRCM employs a granular ball–based density peaks clustering (GB-DP) algorithm to estimate passengers’ route choices based on historical data, enhancing precision and efficiency in passenger classification and route matching. The PICM incorporates tailored itinerary selection strategies that consider train capacity constraints and schedules, enabling accurate inference of passenger itineraries and localization of their spatiotemporal states. The model also estimates train loads and left-behind probabilities to identify congested periods and sections. The effectiveness of DPTAM is validated with synthetic data, on which it achieves higher assignment accuracy than the benchmarks. Additionally, an application to real-world data from Chengdu Metro reveals the impact of congestion on travel behavior and identifies congested periods and high-demand stations and sections, highlighting the model’s potential to enhance URT system efficiency and passenger management.
This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
1. Introduction
The burgeoning congestion in urban rail transit (URT) networks is a pressing concern, as passenger demand consistently surpasses the limited capacity of available trains [1, 2]. This imbalance results in numerous passengers being left behind on platforms even though URT systems operate at full capacity during peak hours. Such crowding not only raises safety concerns on the platforms but also extends individual travel time, thereby degrading the overall travel experience [3, 4].
Central to addressing this issue is the development of methods to accurately assign passengers to trains to reveal the impact of congestion on passenger travel behaviors [5, 6]. Moreover, as congestion may prompt passengers to change their travel decisions, a clear distinction emerges between the travel choices of passengers boarding the first arriving train and those left behind. It becomes crucial to distinguish between the travel choice behaviors of the two types of passengers, to reveal the impact of train capacity constraints on travel behaviors.
With the emergence of data-driven models, there is an attractive solution for passenger assignments, leveraging automated fare collection (AFC) data and automated vehicle location (AVL) data to gain insights into the travel choice behaviors that shape passenger flow [7, 8]. These travel choice behaviors are typically segmented into two sequential decision-making phases: route choices and itinerary choices. Route choices pertain to the selection of a spatial sequence of stations from the origin to the destination [9], while itinerary choices involve choosing among various train sequences along the selected route, subject to spatial and temporal considerations [10]. Although many studies have been conducted on passenger assignments, passenger route choices, and passenger itinerary choices, few have clarified their definitions and interrelationships within modeling frameworks. It is necessary to explore the relationship among passenger assignments, route choices, and itinerary choices and to model these two phases of travel behavior within one passenger assignment framework for the following reasons. First, passenger assignments are determined by passenger route choices and itinerary choices, which are two distinct decision-making phases influenced by different factors. Route choices depend more on travel time and transfer preferences, while itinerary choices are more constrained by train capacity and train schedules. Separating the passenger route choice model (PRCM) and the passenger itinerary choice model (PICM) within a single passenger assignment framework ensures that each module can focus on its specific task, leading to more precise and efficient modeling. Second, the two phases of travel choice behavior have a natural time sequence, where passengers make route choices before making itinerary choices.
By addressing route choices and itinerary choices separately, the passenger assignment model can capture the complete travel process of passengers and offer a holistic view of passenger behaviors in URT networks, which is essential for effectively capturing the complexities of passenger travel behavior and optimizing system operation strategies.
To address these challenges, this paper proposes a data-driven passenger-to-train assignment model (DPTAM) for URT networks. The model dissects passenger travel choice behaviors and integrates train capacities to enhance assignment accuracy. It relies on AFC data, AVL data, and network information. In the first phase, the granular ball–based density peaks clustering (GB-DP) algorithm is employed in the PRCM to cluster passengers based on travel time. Passengers’ travel routes are estimated by matching these clusters to alternative routes via route matching rules, categorizing passengers into two types: Type I, who manage to take the first arriving train, and Type II, who are left behind. In the second phase, itinerary selection strategies considering train capacity and train schedules are built upon passenger types, allowing for the inference of passengers’ itineraries and the localization of their spatiotemporal states within the network. Moreover, evaluation metrics, that is, the load rates of trains and the left-behind probabilities of passengers, are introduced to assess the performance of URT systems. In summary, the contributions of this paper are as follows:
• A data-driven framework composed of the PRCM and the PICM is proposed for passenger-to-train assignments, providing a detailed examination of the passenger decision-making process in URT networks. The model also explicitly distinguishes between travel choice behaviors of passengers boarding the first arriving train and those left behind.
• A clustering algorithm combined with route matching rules is introduced to estimate passengers’ route choices, which enhances the precision and efficiency of the model. Passengers are also divided into Type I and Type II passengers.
• Constraints on train capacity and train schedules are fully considered in the tailored selection principles for passenger itinerary inferences. The PICM can also localize passengers’ spatial–temporal states on the network.
• Evaluation metrics, such as load rates of trains and left-behind probabilities of passengers at stations, are generated to quantitatively analyze capacity utilization, service level, and operating strategies.
The remaining sections are organized as follows. Section 2 presents a comprehensive review of related literature. Section 3 defines the problem and notation (Section 3.1) and details the DPTAM’s methodological framework and algorithms. Section 4 validates the proposed method with synthetic data and demonstrates its applicability with real-world data. Finally, conclusions and future work are given in Section 5.
2. Literature Review
A wealth of research has provided insights into passenger assignment models [5, 11, 12]. Notably, data-driven models employing multisource data have attracted considerable attention recently. Zhu et al. [13] assigned passengers on nontransfer trips to individual trains based on AFC and AVL data. Subsequently, Zhu et al. [14] extended this model to accommodate trips with transfers. Luo et al. [8] outlined the specific travel process of passengers and inferred passengers’ spatiotemporal paths from travel time data recorded by AFC, train diagram data, and passenger walking time data. In contrast to traditional models, data-driven models place greater emphasis on the dynamics of passengers and trains, yielding more accurate results with enhanced efficiency.
In URT networks, train capacity constraints frequently result in passengers being left behind. To address this issue, significant efforts have been made to investigate passenger assignment in congested URT networks, particularly for left-behind passengers. Two early modeling efforts by Nuzzolo et al. [15] and Poon et al. [16] considered the capacity constraint in passenger assignments. Recently, Ma et al. [17] utilized AFC and AVL data to relax the limitations of existing approaches and inferred the left-behind probability distribution of passengers. Zhou et al. [18] introduced an iterative passenger assignment procedure and integrated passenger assignments with line configuration and frequency determination. Moreover, the assignment model has been further explored in conjunction with prediction methods [19] and simulation models [20]. While much previous research on passenger assignment has acknowledged the importance of train capacity constraints, the specific impact of these constraints on passenger travel choice behaviors still needs to be examined, as do comprehensive approaches that integrate these constraints with other factors. Modeling this impact together with passenger travel behaviors is crucial, as it can provide a more nuanced understanding of the left-behind phenomenon and its effects on passenger decision-making processes, which directly influence the effectiveness of passenger assignment. Understanding how passengers react to capacity constraints, for example by altering their travel plans or choosing alternative routes, can also provide behavioral insights that traditional models do not capture and help transit operators optimize strategies and enhance service quality.
Passenger route choice constitutes the core of passenger assignments and represents the first phase of passenger travel choice. A substantial body of literature focuses on PRCM, with the majority of studies suggesting that passengers prefer paths with lower cost rather than considering all possible alternatives [21]. Over the past two decades, numerous studies have employed the LOGIT model in PRCM [22, 23]. To overcome the inefficiency and costliness of manually calibrating model parameters, PRCM has introduced automatic system data to capture passenger travel patterns and assess the impact of crowding on route choice. Zhu and Xu [24] enhanced the generation of route choice sets using the operating information of a URT network. Zhu et al. [25] proposed an AFC data-driven framework to estimate passengers’ route choice patterns. Furthermore, Zhu et al. [26] evaluated route choice models by integrating AFC data with automatic train supervision (ATS) data. Lee et al. [27] explored passengers’ route preferences using smart card data and train log data. However, most existing studies have struggled to handle large-scale URT networks efficiently with automated data, owing to constraints in data processing capability, computational efficiency, and stability.
In trip planning, deciding on a route is merely the initial step. The passenger’s itinerary, which encompasses the trains boarded, the travel time segments, and the passenger’s location in the URT network, entails additional decisions and has garnered increasing interest recently. Zhou and Xu [28] proposed a passenger assignment model in which the passenger’s itinerary choice was generated based on time constraints. Zhao et al. [29] estimated the probability of passengers being assigned to each route and train, based on the distributions of the number of times passengers were left behind at the origin station and transfer stations. Zhu et al. [13] designed a passenger-to-train assignment model for nontransfer trips by decomposing the passenger’s travel time into several segments, and then Zhu et al. [14] extended this model to accommodate trips with transfers and estimated the route choice fractions. Zhu et al. [30] estimated passenger train choice with timetable and AFC data. Su et al. [10] proposed a passenger–train matching probability model characterizing the time–space trajectory of each passenger. Mo et al. [31] explored path choice models utilizing travel time and train operation information. Besides, efforts have also been directed toward multisource data utilization and simulation methods. Gu et al. [32] initiated the use of Wi-Fi probe data to track passenger spatial–temporal trajectories. Zhang et al. [33] simulated passengers and trains with an efficient multi-agent model capable of handling millions of travel trajectories. While these efforts have addressed passenger assignments, route choice, and itinerary choice separately, most studies have not clearly delineated the distinctions among these concepts. As a result, they may fail to capture the dynamic interplay between the different phases of passenger decision-making, leading to inaccuracies and inefficiencies in passenger assignments.
Route choices and itinerary choices are interconnected elements that collectively influence passenger assignments in URT networks. Exploring the relationship among them provides a holistic view of passenger travel behavior, which makes it possible to capture the two sequential decision-making phases in passenger assignment models and to manage passenger flow in congested URT networks. It also enables URT operators to implement targeted operational optimizations and helps policymakers make informed decisions that enhance the overall efficiency and reliability of transit systems.
To sum up, this study is motivated by multiple challenges in passenger assignment in URT networks and aims to address the following gaps:
• The interrelationship among passenger assignment, route choice, and itinerary choice remains an area ripe for deeper investigation. A comprehensive understanding of these interconnected elements is essential for tackling passenger assignment challenges and enhancing URT network efficiency. By exploring the relationship, a holistic view of passenger travel behavior can be provided in the passenger assignment models, leading to more accurate and effective assignment strategies.
• There is a lack of detailed analysis of how train capacity constraints specifically influence passenger travel choices and behaviors. Passenger assignment models need a more nuanced understanding of the left-behind phenomenon and its effects on passenger decision-making processes, which can inform better operation strategies and passenger management.
• Effective use of AFC and AVL data has been limited by the complexities inherent in modeling passenger behaviors and the expanding scale of URT networks. Advanced data-driven approaches are required to address these limitations, which is essential for developing more robust passenger assignment models adept at navigating the intricacies of extensive URT systems.
3. Methodology
3.1. Problem Description
This paper investigates the passenger assignments in URT networks by utilizing AFC and AVL data. We use
Table 1
Passenger travel time parameters.
| Notations | Definitions |
| | Inbound time of passenger |
| | Outbound time of passenger |
| | Walking time of passenger |
| | Waiting time on the platform before boarding at the origin station |
| | Transfer time of passenger |
| | Waiting time of passenger |
| | Walking time of passenger |
| | Travel time of passenger |
| | Travel time of passengers from station |
Table 2
Train operation time parameters.
| Notations | Definitions |
| | Arriving time of the train at station |
| | Dwelling time of the train at station |
| | Departure time of the train at station |
| | Arriving time of the train in section |
| | Running time of the train in section |
| | Departure time of the train in section |
The interrelationships among passenger assignment, route choice, and itinerary choice are illustrated in Figure 1. As defined in Section 1, several route choices can be generated between the origin station and the destination station, while passengers may encounter multiple itinerary choices along a chosen route. The passenger flows on each train and section within the URT network can be obtained once all the passengers during the target time interval are assigned. Considering both route choices and itinerary choices, a passenger entering the URT network at station
[figure(s) omitted; refer to PDF]
Figure 2 depicts the spatial–temporal states of the passenger on a given route and train schedules, showing the available itinerary choices. The passenger enters the URT network at the origin station
[figure(s) omitted; refer to PDF]
3.2. Framework
To address the passenger assignment problem in URT networks, we propose a two-phase DPTAM methodological framework that incorporates route estimation and itinerary selection in passenger-to-train assignments. Figure 3 illustrates the model framework, with a succinct introduction to each part given as follows:
• PRCM: First, passengers are classified into clusters by the GB-DP algorithm according to their travel time, enhancing precision and efficiency in passenger classification. Then, route matching rules are prescribed for passengers in these clusters to match them with alternative routes, identify their route choices, and categorize them into two types based on passenger boarding status.
• PICM: Itinerary selection strategies are tailored for each type of passenger cluster, considering train capacity constraints and schedules, allowing for accurate inference of passenger itineraries and localization of their spatiotemporal states.
[figure(s) omitted; refer to PDF]
In addition to the two primary parts, the framework encompasses the processing of input data, that is, passenger travel time parameters, train operation time parameters, and alternative routes, as well as the calculation of evaluation metrics, that is, load rates of trains and left-behind probabilities of passengers. These metrics can account for variations in passenger behavior during peak and off-peak periods, leading to more reliable and actionable insights for URT system management.
3.3. Data Processing
In the field of data-driven passenger assignment, prior studies predominantly utilize passengers’ travel time
Table 3
Example of passenger travel time data from one selected OD.
| No. | ||
| 01 | 2669 | 339 |
| 02 | 3181 | 402 |
| 03 | 2691 | 341 |
| … … |
3.4. Route Choice Estimation
In this subsection, the GB-DP algorithm initially classifies passengers on each OD pair into several clusters, which are subsequently matched with alternative routes by tailored route matching rules. The passengers are then categorized into Type I and Type II according to their route choices. Additionally, the travel time of each cluster is estimated.
3.4.1. Passenger Classification
We employ the GB-DP algorithm [34] for classifying passengers into clusters and introduce a statistic-based nonparametric method [35] to automatically identify cluster centers and determine the number of clusters. The GB-DP algorithm is a machine learning approach that utilizes extensive historical data to enhance the accuracy and robustness of passenger classification. It extracts latent features of passenger travel choice behaviors from travel time data, which allows the model to capture the details of passenger travel behavior and leads to more precise and reliable route choice estimates. The granular ball (GB), a coarse-grained representation of data, replaces the data point in the density peak clustering (DPC) algorithm [36]. GB-DP rests on the assumption that when the sample size of a GB is small enough, most of the samples contained in it belong to the same cluster. Compared to other DPC algorithms, GB-DP offers efficiency improvements without the need for parameter settings, given that the number of GBs is far smaller than the number of samples in the dataset. GBs are generated using the generation algorithm proposed by Cheng et al. [34].
1. The density
Let
where
Adopting the definition in the classical DPC, the density
where
2. Automatic center identification
Generally, cluster centers are usually the points with high
where
where
3. Cluster generation
Once the cluster centers are identified, each remaining sample should be classified into the same cluster as its nearest neighbor with higher density. The set of clusters
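The three steps above follow the classical density peaks procedure. A minimal Python sketch on one-dimensional travel times is given below under stated assumptions: it uses a cutoff-kernel density and picks a fixed number of centers by the largest product of density and distance, and it works on raw samples rather than granular balls. It therefore does not reproduce the GB generation of [34] or the nonparametric center identification of [35]. The function name and the parameters `d_c` and `n_centers` are hypothetical.

```python
def density_peaks_1d(times, d_c, n_centers):
    """Classical density peaks clustering on scalar travel times (seconds).

    Illustrative only: cutoff density and a fixed number of centers are
    assumptions, not the GB-DP procedure described in the paper.
    """
    n = len(times)
    dist = [[abs(a - b) for b in times] for a in times]
    # Local density: number of other samples closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit samples from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n      # distance to the nearest denser sample
    parent = [-1] * n      # index of that nearest denser sample
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest sample: by convention, use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j
    # Centers: samples with the largest rho * delta (stable sort keeps
    # the densest sample first, so it is always a center).
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_centers]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Each remaining sample joins the cluster of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


travel_times = [2700, 2710, 2720, 2730, 2740, 3400, 3410, 3420, 3430]
labels, centers = density_peaks_1d(travel_times, d_c=25, n_centers=2)
```

In GB-DP the same logic would be applied to granular balls instead of individual samples, which is where the efficiency gain described above comes from.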
3.4.2. Cluster Analysis
The alternative routes
When the time difference between the travel time of alternative routes is significant, passengers prefer to choose the route with the largest matching value. However, it is a common situation that two alternative routes have similar travel time. In such scenarios, selecting a route solely based on the maximum matching value may not align with actual passenger behavior. To address this, we propose the following route matching rules that account for the likelihood of passengers choosing among routes with comparable travel times, reflecting a more accurate estimation of passenger preference within the clusters.
If
Otherwise,
Then passengers are divided into two sets:
3.4.3. Re-evaluation of Passenger Classifications
To enhance the accuracy and reliability of passenger route matching, we introduce a dynamic re-evaluation mechanism. This mechanism periodically assesses the capacity availability of trains and reclassifies passengers accordingly, ensuring that feasible itineraries are generated even in scenarios where initial classifications might have limited options.
The passenger reclassification process involves the following two steps:
1. Train capacity data collection
The data of available train capacities can be generated through simulation or extrapolated from known operational data.
2. Passenger reclassification
If a train with available capacity is identified at station
where
This reclassification process enables the generation of feasible itineraries for these passengers even in scenarios where initial classifications might have limited options.
3.5. Itinerary Choice Inference
Once the travel routes have been estimated, passengers on each route are assigned to their respective trains in the PICM. Their travel trajectories can be depicted by utilizing the train timetables and the travel time parameters. Separate itinerary selection principles are designed for Type I and Type II passengers.
3.5.1. Itinerary Selection Principle for Type I Passengers
For Type I passengers on routes without transfers, the first feasible train whose arrival time is compatible with the passengers’ arrival time at the origin station platform is added to their itinerary. Therefore, the principle can be expressed as follows:
For the Type I passengers on the routes with transfers, we also need to select the train for after-transfer parts of the itinerary. Like the origin station, we choose the first feasible train whose arrival time aligns with the passengers’ arrival time at the platform after transferring and add it to the itinerary. We introduce
It should be emphasized that the estimates of the time parameters can vary drastically across different periods of the day, especially for
The itinerary selection principle for passengers on the route with transfers also includes the following equation:
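As a minimal illustration of the first-feasible-train principle for Type I passengers, the Python sketch below walks a passenger through each leg of a route and picks the first train that departs at or after the passenger reaches the platform. Treating "feasible" as "departure time not earlier than platform arrival" is an assumption, as are the function names and the data layout (times in seconds, one list of `(departure, arrival)` pairs per leg).

```python
from bisect import bisect_left


def first_feasible_train(platform_arrival, departures):
    """Index of the first train departing at or after `platform_arrival`.

    `departures` must be sorted in ascending order. Returns None when
    no train is left in the timetable.
    """
    k = bisect_left(departures, platform_arrival)
    return k if k < len(departures) else None


def type_i_itinerary(inbound_time, legs):
    """Infer the train taken on each leg for a passenger who is never left behind.

    legs: list of (walk_time, trains), where walk_time is the access or
    transfer walking time to the boarding platform and trains is a
    departure-sorted list of (departure, arrival_at_alighting_station).
    Returns (train indices per leg, arrival time at the last station),
    or None if some leg has no feasible train.
    """
    t = inbound_time
    itinerary = []
    for walk_time, trains in legs:
        t += walk_time  # passenger reaches the boarding platform
        k = first_feasible_train(t, [dep for dep, _ in trains])
        if k is None:
            return None
        itinerary.append(k)
        t = trains[k][1]  # passenger alights at the end of this leg
    return itinerary, t


# Hypothetical two-leg route with one transfer (all times in seconds).
legs = [
    (60, [(50, 400), (120, 470), (300, 650)]),
    (90, [(500, 900), (560, 960), (700, 1100)]),
]
result = type_i_itinerary(0, legs)
```

For a route without transfers, `legs` contains a single entry, which reduces the procedure to the origin-station rule stated first in this subsection.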
3.5.2. Itinerary Selection Principle for Type II Passengers
Essentially, the generation of itineraries for Type II passengers mainly depends on the number of trains by which passengers have been left behind at the origin station and at each transfer station. It should be mentioned that, even at the same station, passengers may be left behind by different numbers of trains depending on the level of congestion. Therefore, there are two decisive factors for itinerary selection: the station at which passengers have been left behind and the number of trains by which they have been left behind at that station.
For Type II passengers traveling on routes without transfers, passengers can only be left behind at the origin station, so only the number of trains by which they are left behind needs to be considered in itinerary selection.
For Type II passengers traveling on routes with transfers, passengers are likely to be left behind at the origin station or any of the transfer stations. In order to make informed itinerary selections, it is necessary to take into account the number of stations where passengers are left behind, as well as the number of trains that they may be left behind at each station. Similar to Type I passengers, the feasible itineraries for Type II passengers on routes with transfers are formulated as follows:
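To make the two decisive factors concrete, the Python sketch below enumerates candidate itineraries for a Type II passenger by trying every combination of left-behind counts per boarding station and keeping those consistent with the recorded outbound time. This is a sketch under assumptions rather than the formulation used in the paper: the consistency check (alighting time plus egress walking time not later than the outbound time), the cap `max_left_behind`, and all names are hypothetical.

```python
from bisect import bisect_left
from itertools import product


def type_ii_itineraries(inbound, outbound, legs, egress_walk, max_left_behind=2):
    """Enumerate itineraries in which the passenger is left behind at least once.

    legs: list of (walk_time, trains); trains is a departure-sorted list
    of (departure, arrival_at_alighting_station), times in seconds.
    Returns a list of (left-behind counts per station, train indices).
    """
    results = []
    for skips in product(range(max_left_behind + 1), repeat=len(legs)):
        if not any(skips):
            continue  # zero skips everywhere is the Type I case
        t = inbound
        taken = []
        feasible = True
        for (walk_time, trains), n_skip in zip(legs, skips):
            t += walk_time
            # First train the passenger could board, shifted by the
            # number of trains that left without them.
            k = bisect_left([dep for dep, _ in trains], t) + n_skip
            if k >= len(trains):
                feasible = False
                break
            taken.append(k)
            t = trains[k][1]
        # Keep the itinerary only if the passenger can still tap out in time.
        if feasible and t + egress_walk <= outbound:
            results.append((skips, tuple(taken)))
    return results


# Hypothetical route with one transfer (all times in seconds).
legs = [
    (60, [(50, 400), (120, 470), (300, 650)]),
    (90, [(500, 900), (560, 960), (700, 1100), (800, 1200)]),
]
candidates = type_ii_itineraries(inbound=0, outbound=1170, legs=legs, egress_walk=60)
```

In this toy case only one combination survives: the passenger boards the first feasible train at the origin and is left behind once at the transfer station.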
3.6. Load Rate Estimation
The output of the PICM can be used to estimate the load on individual trains. Based on a passenger’s itinerary, the spatial–temporal states of the passenger can be calibrated by travel time parameters and train timetables.
To demonstrate this procedure, a left-behind passenger on the route in Figure 4 is taken as an example. This passenger enters the URT system from station
[figure(s) omitted; refer to PDF]
Table 4
Timetable of
| Train_ID | Line_ID | Station_ID | Arrive_time | Departure_time | Direction |
| | Line 1 | | 07:03:15 | 07:04:30 | Up |
| | Line 1 | | 07:05:00 | 07:06:17 | Up |
| | Line 1 | | 07:07:44 | 07:08:57 | Up |
| | Line 2 | | 07:13:03 | 07:14:22 | Down |
| | Line 2 | | 07:15:59 | 07:16:59 | Down |
| | Line 2 | | 07:18:37 | 07:19:55 | Down |
Table 5
Spatial–temporal states inferred from passenger’s itinerary.
| Spatial position | Train in itinerary | Time span |
The load rate reflects the utilization of the train capacity. The load rate
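A minimal Python sketch of a load rate computation is given here, assuming the common definition of load rate as the number of passengers on board a train in a section divided by the train capacity. The input layout (one record per passenger ride with inclusive section indices) and the function name are hypothetical, and the exact formula used in the paper may differ.

```python
from collections import Counter


def load_rates(ride_segments, capacity):
    """Load rate per (train, section).

    ride_segments: iterable of (train_id, first_section, last_section),
    one record per passenger ride, section indices inclusive.
    capacity: dict mapping train_id to its passenger capacity.
    """
    onboard = Counter()
    for train_id, first, last in ride_segments:
        # The passenger occupies the train in every section ridden.
        for section in range(first, last + 1):
            onboard[(train_id, section)] += 1
    return {key: count / capacity[key[0]] for key, count in onboard.items()}


# Toy example with a capacity of 4 so the ratios are easy to read.
rides = [("T1", 0, 2), ("T1", 1, 2), ("T1", 2, 3)]
rates = load_rates(rides, {"T1": 4})
```

With real data the capacity would come from the line-specific limits reported in Section 4.1, and the ride records from the itineraries inferred by the PICM.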
3.7. Left-Behind Probability Estimation
The left-behind probability illustrates the mismatch between the train capacity and the passenger flow, which reflects the operational status of the URT system by time interval. The left-behind probability
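As a hedged illustration, the left-behind probability can be approximated as the share of passengers arriving at a station platform during a time interval who miss at least one train. The Python sketch below implements that reading; the definition, the 15-minute default interval, and all names are assumptions rather than the exact formulation of the paper.

```python
from collections import defaultdict


def left_behind_probability(boardings, interval=900):
    """Share of left-behind passengers per (station, interval index).

    boardings: iterable of (station, platform_arrival_time_s, trains_missed),
    one record per passenger boarding event.
    """
    total = defaultdict(int)
    left = defaultdict(int)
    for station, t, missed in boardings:
        key = (station, int(t // interval))
        total[key] += 1
        if missed > 0:
            left[key] += 1
    return {key: left[key] / total[key] for key in total}


# Hypothetical boarding events at one station (times in seconds).
events = [
    ("S1", 100, 0),
    ("S1", 200, 1),
    ("S1", 300, 2),
    ("S1", 850, 0),
    ("S1", 950, 0),
]
prob = left_behind_probability(events)
```

The `trains_missed` field corresponds to the left-behind counts inferred for Type II passengers; Type I passengers contribute records with zero missed trains.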
4. Experiment
4.1. Data Description
The URT network data, AFC data, and AVL data in the case study were collected from Chengdu Metro in April 2018. As shown in Figure 5, this URT network comprises 6 metro lines and 136 stations in total. The train capacity limits on the lines of Chengdu Metro are 1762 passengers (Line 1), 1762 passengers (Line 2), 1752 passengers (Line 3), 1752 passengers (Line 4), 2194 passengers (Line 7), and 2100 passengers (Line 10). All stations of interest are numbered from S1 to S37. The AFC data consist of the card_ID, the inbound and outbound stations, and the inbound and outbound time of each passenger. Train arrival and departure times at each station are collected from the AVL system. One peak period in the morning (7:00–8:00) and one off-peak period in the afternoon (15:00–16:00) on April 20, 2018 were selected as the study periods.
[figure(s) omitted; refer to PDF]
To validate the effectiveness of the model, we implement a validation framework that uses both synthetic and real-world data. Because ground-truth passenger itinerary data are not available, the DPTAM is validated and evaluated with synthetic data based on Chengdu Metro. Meanwhile, the performance of the model in a practical URT environment and its practical application are demonstrated with a real-world dataset from Chengdu Metro. Based on the results, several insights into operational performance are obtained for future improvements.
4.2. Baseline Models
Passenger assignment models using state-of-the-art clustering algorithms, including DPC, FKNN-DPC, AP, DBSCAN, and K-means, serve as baselines for comparing model performance.
The detailed information for each benchmark is listed as follows.
• DPTAM-DPC [36]: The cutoff distance is chosen so that the average number of neighbors of each sample is around 1%–2% of the total number of samples.
• DPTAM-FKNN-DPC [39]: The number of nearest neighbors is 6.
• DPTAM-AP [40]: The sample preference is −3010 and the damping is 0.9.
• DPTAM-DBSCAN [41]: The eps is 48 and the MinPts is 4.
• DPTAM-K-means [42]: The number of clusters is 4.
All the experiments were conducted on a Windows 10 server with an Intel Core i7-13700K 3.40 GHz CPU and 32 GB RAM. Models were implemented with PyTorch. AP, DBSCAN, and K-means were implemented with the sklearn package.
4.3. Validation on Synthetic Data
4.3.1. Synthetic Data Generation
The data of passengers from S1 to S37 on April 20, 2018 were used to generate the synthetic data for model validation and performance evaluation. Three alternative routes were found [37, 38], as shown in Figure 6, but only the first two were involved in the simulation, as the travel time of Alternative Route 3 is so long that few passengers choose it. Alternative Route 1 involves only one transfer, at S16, which is one of the most crowded stations in the network; Alternative Route 2 has a much shorter travel time than Alternative Route 1 but has two transfer stations, that is, a congested transfer station S9 and an uncongested transfer station S36. In the simulation, the route choice rates for the two routes were predefined to assign passengers to routes. The available capacity of each train at the stations of interest was also predefined to assign passengers to trains.
[figure(s) omitted; refer to PDF]
We generated synthetic data for each passenger based on the travel process shown in Figure 2. Each passenger enters the URT system at the inbound time from the AFC data, walks to the platform with randomly assigned a
The travel time distributions of the synthetic data and the actual data during the peak and off-peak periods are illustrated in Figure 7. The travel time distribution of the synthetic data is consistent with actual conditions. The actual data show stronger regularity during the peak period, and the synthetic data fit the actual data better in the peak period than in the off-peak period. Thus, we selected passengers during the peak period as the dataset to evaluate model performance.
[figure(s) omitted; refer to PDF]
Figure 8 compares the distributions of the number of feasible itineraries for passengers from S1 to S37 in the synthetic data and the actual data. Passengers in the peak period are more likely to be left behind and have more feasible itineraries than those in the off-peak period. The distribution of the synthetic data closely matches that of the actual data.
[figure(s) omitted; refer to PDF]
4.3.2. Performance Evaluation
In total, 314 samples during peak period were used as examples to demonstrate the detailed process of the model and evaluate model performances; 28 GBs were generated by GB-DP. The plot of samples divided into GBs and classification results are given in Figure 9. Three clusters were yielded with the respective average travel time of 2738, 2962, and 3457 s. The number of passengers in the three clusters was 172, 41, and 95, respectively. The estimated travel time on the two alternative routes during off-peak period was 3414.5 and 2726.3 s, obtained statistically by the method proposed in [43]. Based on the cluster analysis, passengers in three clusters were identified as Type I passengers on Alternative Route 2, Type II passengers on Alternative Route 2, and Type I passengers on Alternative Route 1.
[figure(s) omitted; refer to PDF]
Then, the passengers’ itineraries can be acquired according to the selection principles. Table 6 illustrates the itineraries of a set of passengers from S1 to S37 during 07:25–07:30 on April 20, 2018.
Table 6
The itineraries of passengers from S1 to S37 during 07:25–07:30 on April 20, 2018.
| No. | Inbound time | Outbound time | Itinerary |
| 01 | 07:25:05 | 08:07:53 | |
| 02 | 07:25:34 | 08:20:18 | |
| 03 | 07:25:37 | 08:08:04 | |
| 04 | 07:25:38 | 08:19:56 | |
| 05 | 07:25:41 | 08:12:10 | |
| 06 | 07:26:48 | 08:12:32 | |
| 07 | 07:28:42 | 08:16:01 | |
| 08 | 07:30:25 | 08:16:02 | |
Based on the results of PRCM, we conducted a comprehensive comparative analysis between GB-DP and several state-of-the-art clustering algorithms, including FKNN-DPC, DPC, AP, DBSCAN, and K-means. A performance comparison based on the Silhouette coefficient (SC), Calinski–Harabasz index (CHI), Davies–Bouldin index (DBI), Dunn’s validation index (DVI) [44], adjusted Rand index (ARI), adjusted mutual information (AMI) [45], accuracy (ACC) [46], and runtime is presented in Table 7.
Table 7
The comparison of clustering algorithms in terms of SC, CHI, DBI, DVI, ARI, AMI, ACC, and runtime on synthetic data of Chengdu Metro.
| Models | SC | CHI | DBI | DVI | ARI | AMI | ACC | Runtime |
| GB-DP | 0.539 | 1048.737 | 0.650 | 0.048 | 0.834 | 0.784 | 0.936 | 0.014 |
| FKNN-DPC | 0.487 | 1007.342 | 0.681 | 0.053 | 0.778 | 0.762 | 0.917 | 0.016 |
| DPC | 0.393 | 683.191 | 0.931 | 0.030 | 0.585 | 0.664 | 0.828 | 0.019 |
| AP | 0.457 | 597.887 | 0.843 | 0.048 | 0.536 | 0.575 | 0.755 | 0.010 |
| DBSCAN | 0.646 | 928.868 | 0.501 | 0.104 | 0.529 | 0.619 | 0.736 | 0.001 |
| K-means | 0.423 | 720.532 | 0.870 | 0.032 | 0.768 | 0.748 | 0.911 | 0.005 |
The results show that GB-DP outperforms the other algorithms on the synthetic dataset of Chengdu Metro in terms of CHI, ARI, AMI, and ACC. AP, DBSCAN, and K-means run faster than GB-DP, but their ARI, AMI, and ACC values are lower. In addition, parameter tuning for AP and DBSCAN is much more time-consuming than for the other algorithms. FKNN-DPC also achieves good accuracy, but it needs more time to calculate the local density of each sample. Although GB-DP spends some time generating GBs, it requires much less time for density calculations because the number of GBs is much smaller than the number of samples in the data. Moreover, it obtains the clustering results automatically, without any parameter settings or manual selection. Overall, GB-DP delivers better solution quality than the other typical clustering algorithms at an acceptable computational cost, which supports the effectiveness of GB-DP and its potential for passenger assignment modeling.
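As a reminder of what the ARI column in Table 7 measures, the following sketch computes the adjusted Rand index from two label vectors using the standard pair-counting formula. It is a plain-Python illustration with hypothetical label values, not the evaluation code used in the experiments.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two labelings of the same samples."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label vectors must have the same length")
    total_pairs = comb(len(true_labels), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: labelings agree trivially
    # Pairs of samples placed together in both, in the true, and in the predicted labeling.
    together_both = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    together_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    together_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())
    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (together_both - expected) / (maximum - expected)


# Hypothetical labels: the index ignores how clusters are named.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the index is corrected for chance, a labeling that is no better than random scores close to zero, which makes it a stricter measure than raw accuracy when cluster sizes are unbalanced.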
We also compared passengers’ estimated itineraries with the “actual” itineraries in the synthetic data to evaluate the performance of the DPTAM and its baseline models, as shown in Table 8. The proposed model achieves the best ARI, AMI, and ACC. Although its runtime is 0.009 s per OD pair longer than that of DPTAM-K-means, its ARI (0.811 versus 0.675), AMI (0.769 versus 0.718), and ACC (0.916 versus 0.905) are all higher. Therefore, the DPTAM can achieve favorable assignment results and can be effectively applied in URT networks.
Table 8
The comparison of DPTAM and baseline models in terms of ARI, AMI, ACC, and runtime on synthetic data of Chengdu Metro.
| Models | ARI | AMI | ACC | Runtime |
| DPTAM | 0.811 | 0.769 | 0.916 | 0.019 |
| DPTAM-FKNN-DPC | 0.700 | 0.754 | 0.818 | 0.021 |
| DPTAM-DPC | 0.502 | 0.602 | 0.810 | 0.024 |
| DPTAM-AP | 0.488 | 0.514 | 0.687 | 0.015 |
| DPTAM-DBSCAN | 0.510 | 0.558 | 0.680 | 0.005 |
| DPTAM-K-means | 0.675 | 0.718 | 0.905 | 0.010 |
4.4. Application on Real-World Data
Line 2 of Chengdu Metro was chosen to present the results of a whole-network passenger-to-train assignment involving 18,360 OD pairs, with 186,972 travel records during the peak period and 116,035 travel records during the off-peak period on April 20, 2018. By applying the DPTAM to real-world data, the model’s accuracy can be validated by comparing estimated section flows with actual values in both the peak and off-peak periods, demonstrating its reliability in capturing passenger behavior under varying congestion conditions. The examination of travel times, train loads, and left-behind probabilities also provides insights into how congestion affects passenger travel choices and identifies congested periods and sections. The real-world case analysis illustrates the model’s scalability and potential for use in other urban transit systems, which highlights the practical benefits of our research for enhancing passenger management and operational efficiency in URT networks.
4.4.1. Section Flow Analysis
To evaluate the computational accuracy of DPTAM, we compare the assigned and actual section flows on Line 2 of Chengdu Metro in both the peak and off-peak periods. The actual section flows were provided by Chengdu Rail Transit Group Co., Ltd. The root-mean-squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are adopted to evaluate the accuracy of DPTAM according to equations (22)–(24).
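For readers who want to reproduce this kind of comparison, the sketch below computes the three error measures using their standard definitions, which are assumed here to match equations (22)–(24). MAPE is returned as a fraction (so 0.05 means 5%), and the function name and the flow values in the demo are hypothetical.

```python
from math import sqrt


def section_flow_errors(actual, assigned):
    """Return (RMSE, MAE, MAPE) between actual and assigned section flows.

    actual, assigned: equally long sequences of section flows (passengers);
    actual values must be non-zero because MAPE divides by them.
    """
    if len(actual) != len(assigned) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [est - obs for obs, est in zip(actual, assigned)]
    n = len(diffs)
    rmse = sqrt(sum(d * d for d in diffs) / n)
    mae = sum(abs(d) for d in diffs) / n
    mape = sum(abs(d) / obs for d, obs in zip(diffs, actual)) / n
    return rmse, mae, mape


# Hypothetical flows on two sections.
rmse, mae, mape = section_flow_errors([1000, 2000], [1100, 1900])
print(f"RMSE={rmse:.1f}, MAE={mae:.1f}, MAPE={mape:.3f}")
```

RMSE and MAE are expressed in passengers and therefore grow with flow volume, whereas MAPE is scale-free, which is why the two kinds of measures can rank directions differently.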
Table 9 displays the RMSE, MAE, and MAPE of DPTAM. The errors are small, with a maximum MAPE of 5%. Notably, the assignment accuracies during off-peak hours are similar in the up and down directions. During peak hours, however, the assignment results are more accurate in the up direction, which serves many more passengers than the down direction. This suggests that both passenger volume and time of day affect the model’s accuracy.
Table 9
Computation accuracy of DPTAM on Chengdu Metro dataset.
| Section flow | RMSE | MAE | MAPE |
| 7:00–8:00 up direction | 334.595 | 254.516 | 0.033 |
| 7:00–8:00 down direction | 190.684 | 152.581 | 0.019 |
| 15:00–16:00 up direction | 300.363 | 221.968 | 0.031 |
| 15:00–16:00 down direction | 321.350 | 280.129 | 0.050 |
To contextualize these results, we compared our findings with those reported in recent studies. Because actual section flow data are rarely available, only a few studies analyze model accuracy. Specifically, Su et al. [10] report MAPE values that are often below 10%, which is still higher than the 5% maximum MAPE in our study. This highlights the improved accuracy of our model. Regarding problem scale, DPTAM is designed for large-scale URT networks with a daily ridership of over 10 million. It handles the assignment of 186,972 travel records during the peak period and 116,035 travel records during the off-peak period in the real-world case, which is consistent with the magnitude of the data processed in the latest studies [8, 20, 31]. The ability to handle a dataset of this size underscores the robustness and scalability of the DPTAM. Regarding computational efficiency, the runtime in Table 8 shows that the DPTAM achieves favorable results within an acceptable time frame, and the model’s runtime remains competitive with existing models. For example, the runtime per OD pair for DPTAM is comparable to that reported by Mo et al. [31], demonstrating that our model can achieve high accuracy without compromising computational efficiency.
4.4.2. Travel Time Analysis
Figure 10(a) shows the travel time distribution of passengers from S1 to S37 during the peak period, grouped by the clusters obtained from DPTAM; the travel times are derived from AFC data. Figure 10(b) illustrates the travel time distribution of these passengers in the unconstrained network, inferred by assigning passengers to the shortest path and the first arriving train without capacity constraints. The actual average travel time is 2977 s, while the average travel time of passengers in the unconstrained network is 2722 s. The difference of 255 s per passenger can be attributed to congestion effects.
[figure(s) omitted; refer to PDF]
4.4.3. Train Load Analysis
Figures 11 and 12 illustrate the train diagram and the load rates of trains on Line 2 of Chengdu Metro during the morning peak period (7:00–8:00) and the afternoon off-peak period (15:00–16:00). In the morning rush hour, the train loads in sections S12-S13, S14-S15, and S8-S9 in the up direction (from S1 toward S32) approach 100% of the standard capacity at 7:40, 7:52, and 7:53, respectively. When the load rate of a train approaches 100%, an early warning about potential peak periods and high-demand sections is given to the URT operators. Meanwhile, the train loads in sections S19-S18 and S24-S23 in the down direction (from S32 toward S1) exceed 120% at 7:54 and 7:55, respectively. When the train loads exceed the upper limit of 120%, the train schedule needs to be adjusted by increasing the frequency and shortening the headway to improve capacity. As shown in Figure 11, the headway in this period is 3 min, whereas the minimum headway on Line 2 of Chengdu Metro is 2.25 min. No more than 6 additional trains per hour can therefore be scheduled, subject to other operational considerations, for example, the capacities of the respective stations, operating costs, and available trains.
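The headway arithmetic and the two load-rate thresholds mentioned above can be checked with a short sketch. The 120% upper limit is taken from the text; the 95% early-warning threshold is an assumption standing in for "approaches 100%", and the function names are hypothetical.

```python
from math import floor


def trains_per_hour(headway_min):
    """Whole number of trains that fit into one hour at a given headway."""
    return floor(60 / headway_min)


def load_status(load_rate, warning=0.95, upper_limit=1.20):
    """Classify a train load rate (1.0 = 100% of standard capacity).

    warning is an assumed early-warning threshold; upper_limit follows the text.
    """
    if load_rate > upper_limit:
        return "adjust schedule"
    if load_rate >= warning:
        return "early warning"
    return "normal"


current = trains_per_hour(3.0)    # scheduled headway in the morning peak
maximum = trains_per_hour(2.25)   # minimum headway on Line 2
print(current, maximum, maximum - current)
print(load_status(0.97), load_status(1.25), load_status(0.60))
```

At a 3-min headway the line runs 20 trains per hour, while a 2.25-min headway allows at most 26, which gives the upper bound of 6 additional trains per hour.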
[figure(s) omitted; refer to PDF]
In the off-peak period, most train loads are lower than 60% in both up and down directions. The 5-min headway offers a stable and sufficient service level while simultaneously maintaining a relatively high resource utilization in this period.
4.4.4. Left-Behind Analysis
Figure 13 presents the left-behind probability at all the stations in the peak period. In the up direction, seven stations have a left-behind probability higher than 20%: S1, S9, S12, S15, S16, S22, and S26. In the down direction, stations S32, S28, S22, S16, S15, S12, and S9 have high probabilities. S1 and S32 are the terminal stations of Line 2 and serve as the origin station in the up and down directions, respectively; the other stations, except S26 and S28, are transfer stations. Based on these results, it can be inferred that passengers are more likely to be left behind at origin stations and transfer stations. However, because most trains operate at full capacity with a short headway during the peak period, adjusting the train schedule offers only limited room to reduce the left-behind probability. The effective organization of passengers at these stations is therefore important: it can improve the service level of the URT system and minimize the risk of accidents in emergency events caused by overcrowding on the platform.
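A station-level screening of this kind can be sketched as follows. The sketch assumes that the left-behind probability at a station is the share of boarding passengers who missed at least one train there; the paper's formal definition is given in the model section, and the record layout, function names, and demo values below are hypothetical.

```python
from collections import defaultdict


def left_behind_probability(boardings):
    """Share of boarding passengers per station who missed at least one train.

    boardings: iterable of (station, trains_missed) records, one per boarding.
    """
    totals = defaultdict(int)
    left_behind = defaultdict(int)
    for station, trains_missed in boardings:
        totals[station] += 1
        if trains_missed > 0:
            left_behind[station] += 1
    return {station: left_behind[station] / totals[station] for station in totals}


def high_risk_stations(probabilities, threshold=0.20):
    """Stations whose left-behind probability exceeds the threshold."""
    return sorted(s for s, p in probabilities.items() if p > threshold)


# Hypothetical boarding records.
records = [("S9", 1), ("S9", 0), ("S9", 0), ("S10", 0), ("S10", 0)]
probs = left_behind_probability(records)
print(probs)
print(high_risk_stations(probs))
```

The 20% threshold matches the cutoff used in the discussion of Figure 13; in practice it would be tuned to the operator's service standard.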
[figure(s) omitted; refer to PDF]
4.5. Discussion
Our research has yielded promising results, offering a better understanding of passenger behavior in the URT system. Using the evaluation metrics as a basis for further optimizing operating strategies, several practical measures can be implemented to better align passenger flows with URT operations and to make the URT system operate more efficiently.
• The train schedule can be better designed and optimized based on the evaluation metrics. Additionally, these metrics may also inform the development planning of the URT network and public transit system.
• Passenger flows can be better organized and controlled during rush hours. One effective method is to provide passengers with clear guidance on alternative routes at busy stations.
• Evacuation facilities and optimal evacuation plans can be developed, tested, and executed, especially for the busiest stations, to minimize the risk of accidents caused by overcrowding on the platforms.
• Dynamic ticket pricing strategies can be designed to shift passenger flows in high-demand sections from peak periods to off-peak periods, which can help relieve pressure and congestion on the URT system.
5. Conclusion
This study proposes the DPTAM for congested URT networks using AFC data and AVL data. The most significant contribution of this paper is that we explicitly distinguish the passengers’ route choice behavior and the itinerary choice behavior in the travel process, based on which a two-phase architecture for passenger assignment is proposed. Moreover, passengers are classified into two types considering whether they are left behind or not, and tailored methods for different types of passengers are employed to ensure the accuracy and confidence of the model.
With the proposed approach, passengers’ itineraries and train loads can be obtained, based on which the passenger movements and the state of a URT network can be thoroughly inferred and visualized. Furthermore, the model can also be used to evaluate several key operating performance metrics. These metrics form the basis for further optimizing and improving capacity utilization, service levels, and operating strategies. For example, the peak periods and the high-demand sections in the URT network can be identified, so the train schedule can be adjusted accordingly to better match the demand. Besides, improved passenger organization and emergency response strategies need to be devised in busy stations. Furthermore, some strategies, for example, dynamic ticket pricing, may be employed to adjust the demand patterns to relieve the pressures of peak periods and high-demand sections. Passengers may also be better informed and guided with URT operating information through mobile Apps. The validity and applicability of the proposed DPTAM are shown with synthetic data and real-world data from Chengdu Metro. The results highlight the importance of capacity constraints in passenger assignment models and provide critical insights for passenger assignments of congested URT networks.
Future research can be carried out to tackle the following limitations of this paper:
• The accuracy of the model’s outputs is highly dependent on travel time parameter estimations. Future research is suggested to employ a broader array of multisource data, such as mobile phone records, Wi-Fi probe data, and sensor data, to improve the travel time estimation accuracy.
• Dynamic itinerary adjustment mechanisms are still unexplored due to the difficulty in accessing real-time data, such as train capacity and passenger boarding status. In future work, improved access to real-time data would enable the development of advanced algorithms that dynamically adjust passenger itineraries based on real-time conditions, further enhancing the accuracy and reliability of our model.
• The model’s results are not available in real time, constraining its application for real-time operational management. To address this, additional research is required to augment real-time data accessibility and integrate passenger prediction models with passenger assignments.
• Our model can only be applied to analyze URT networks. Future research is thus suggested to incorporate multimode trips in regional rail transit networks, facilitating passenger transfers between URT and other rail transit modes.
Author Contributions
Di Wen: conceptualization, methodology, data curation, formal analysis, investigation, resources, validation, visualization, writing – original draft, and writing – review and editing. Hongxia Lv: conceptualization, supervision, resources, project administration, and funding acquisition. Hao Yu: conceptualization, supervision, formal analysis, writing – original draft, and writing – review and editing.
Funding
This study was supported by the National Key Research and Development Program of China (2022YFB4300502), the National Natural Science Foundation of China (52172321), the Key Research and Development Program of Guangzhou Municipality (202206030007), the Sichuan Province Science and Technology Innovation Talent Project (2024JDRC0020), and the Sichuan Province Science and Technology Support Program (2025YFHZ0328). The third author was partially supported by the Norwegian Directorate for Higher Education and Skills (UTF-2021/10166).
[1] P. Noursalehi, H. N. Koutsopoulos, J. Zhao, "Real Time Transit Demand Prediction Capturing Station Interactions and Impact of Special Events," Transportation Research Part C: Emerging Technologies, vol. 97, pp. 277-300, DOI: 10.1016/j.trc.2018.10.023, 2018.
[2] A. Drabicki, R. Kucharski, O. Cats, A. Szarata, "Modelling the Effects of Real-Time Crowding Information in Urban Public Transport Systems," Transportmetrica A: Transport Science, vol. 17 no. 4, pp. 675-713, DOI: 10.1080/23249935.2020.1809547, 2020.
[3] X. Chen, L. Zhou, Z. Bai, Y. Yue, B. Guo, H. Zhou, "Data-driven Approaches to Mining Passenger Travel Patterns: “Left-Behinds” in a Congested Urban Rail Transit Network," Journal of Advanced Transportation, vol. 2019, DOI: 10.1155/2019/6830450, 2019.
[4] B. Mo, Z. Ma, H. N. Koutsopoulos, J. Zhao, "Capacity-constrained Network Performance Model for Urban Rail Systems," Transportation Research Record: Journal of the Transportation Research Board, vol. 2674 no. 5, pp. 59-69, DOI: 10.1177/0361198120914309, 2020.
[5] A. Nuzzolo, U. Crisalli, L. Rosati, "A Schedule-Based Assignment Model with Explicit Capacity Constraints for Congested Transit Networks," Transportation Research Part C: Emerging Technologies, vol. 20 no. 1, pp. 16-33, DOI: 10.1016/j.trc.2011.02.007, 2012.
[6] K. Huang, F. Liao, "A Novel Two-Stage Approach for Energy-Efficient Timetabling for an Urban Rail Transit Network," Transportation Research Part E: Logistics and Transportation Review, vol. 176, DOI: 10.1016/j.tre.2023.103212, 2023.
[7] S. H. Cheon, C. Lee, S. Shin, "Data-Driven Stochastic Transit Assignment Modeling Using an Automatic Fare Collection System," Transportation Research Part C: Emerging Technologies, vol. 98, pp. 239-254, DOI: 10.1016/j.trc.2018.09.011, 2019.
[8] Q. Luo, B. Lin, Y. Lyu, Y. He, X. Zhang, Z. Zhang, "Spatiotemporal Path Inference Model for Urban Rail Transit Passengers Based on Travel Time Data," IET Intelligent Transport Systems, vol. 17 no. 7, pp. 1395-1414, DOI: 10.1049/itr2.12332, 2023.
[9] J. Arriagada, M. A. Munizaga, C. A. Guevara, C. Prato, "Unveiling Route Choice Strategy Heterogeneity From Smart Card Data in a Large-Scale Public Transport Network," Transportation Research Part C: Emerging Technologies, vol. 134, DOI: 10.1016/j.trc.2021.103467, 2022.
[10] G. Su, B. Si, K. Zhi, H. Li, "A Calculation Method of Passenger Flow Distribution in Large-Scale Subway Network Based on Passenger-Train Matching Probability," Entropy, vol. 24 no. 8, DOI: 10.3390/e24081026, 2022.
[11] Y. Jiang, A. A. Ceder, "Incorporating Personalization and Bounded Rationality Into Stochastic Transit Assignment Model," Transportation Research Part C: Emerging Technologies, vol. 127, DOI: 10.1016/j.trc.2021.103127, 2021.
[12] N. Oliker, S. Bekhor, "A Frequency Based Transit Assignment Model that Considers Online Information," Transportation Research Part C: Emerging Technologies, vol. 88, pp. 17-30, DOI: 10.1016/j.trc.2018.01.004, 2018.
[13] Y. Zhu, H. N. Koutsopoulos, N. H. M. Wilson, "A Probabilistic Passenger-To-Train Assignment Model Based on Automated Data," Transportation Research Part B: Methodological, vol. 104, pp. 522-542, DOI: 10.1016/j.trb.2017.04.012, 2017.
[14] Y. Zhu, H. N. Koutsopoulos, N. H. M. Wilson, "Passenger Itinerary Inference Model for Congested Urban Rail Networks," Transportation Research Part C: Emerging Technologies, vol. 123, DOI: 10.1016/j.trc.2020.102896, 2021.
[15] A. Nuzzolo, F. Russo, U. Crisalli, "A Doubly Dynamic Schedule-Based Assignment Model for Transit Networks," Transportation Science, vol. 35 no. 3, pp. 268-285, DOI: 10.1287/trsc.35.3.268.10149, 2001.
[16] M. H. Poon, S. C. Wong, C. O. Tong, "A Dynamic Schedule-Based Model for Congested Transit Networks," Transportation Research Part B: Methodological, vol. 38 no. 4, pp. 343-368, DOI: 10.1016/s0191-2615(03)00026-2, 2004.
[17] Z. Ma, H. N. Koutsopoulos, Y. Chen, N. H. M. Wilson, "Estimation of Denied Boarding in Urban Rail Systems: Alternative Formulations and Comparative Analysis," Transportation Research Record: Journal of the Transportation Research Board, vol. 2673 no. 11, pp. 771-778, DOI: 10.1177/0361198119857034, 2019.
[18] Y. Zhou, H. Yang, Y. Wang, X. Yan, "Integrated Line Configuration and Frequency Determination with Passenger Path Assignment in Urban Rail Transit Networks," Transportation Research Part B: Methodological, vol. 145, pp. 134-151, DOI: 10.1016/j.trb.2021.01.002, 2021.
[19] A. Nuzzolo, U. Crisalli, A. Comi, L. Rosati, "A Mesoscopic Transit Assignment Model Including Real-Time Predictive Information on Crowding," Journal of Intelligent Transportation Systems, vol. 20 no. 4, pp. 316-333, DOI: 10.1080/15472450.2016.1164047, 2016.
[20] L. Yu, H. Liu, Z. Fang, R. Ye, Z. Huang, Y. You, "A New Approach on Passenger Flow Assignment With Multi-Connected Agents," Physica A: Statistical Mechanics and Its Applications, vol. 628, DOI: 10.1016/j.physa.2023.129175, 2023.
[21] Y. Deng, Y. X. Chen, Y. J. Zhang, S. Mahadevan, "Fuzzy Dijkstra Algorithm for Shortest Path Problem Under Uncertain Environment," Applied Soft Computing, vol. 12 no. 3, pp. 1231-1237, DOI: 10.1016/j.asoc.2011.11.011, 2012.
[22] T. Mai, M. Fosgerau, E. Frejinger, "A Nested Recursive Logit Model for Route Choice Analysis," Transportation Research Part B: Methodological, vol. 75, pp. 100-112, DOI: 10.1016/j.trb.2015.03.015, 2015.
[23] S. Sun, W. Y. Szeto, "Logit-based Transit Assignment: Approach-Based Formulation and Paradox Revisit," Transportation Research Part B: Methodological, vol. 112, pp. 191-215, DOI: 10.1016/j.trb.2018.03.018, 2018.
[24] W. Zhu, R. Xu, "Generating Route Choice Sets With Operation Information on Metro Networks," Journal of Traffic and Transportation Engineering, vol. 3 no. 3, pp. 243-252, DOI: 10.1016/j.jtte.2016.05.001, 2016.
[25] W. Zhu, W.-l. Fan, A. M. Wahaballa, J. Wei, "Calibrating Travel Time Thresholds With Cluster Analysis and Afc Data for Passenger Reasonable Route Generation on an Urban Rail Transit Network," Transportation, vol. 47 no. 6, pp. 3069-3090, DOI: 10.1007/s11116-019-10040-8, 2019.
[26] W. Zhu, J. Wei, W. Fan, "Data Fusion Approach for Evaluating Route Choice Models in Large-Scale Complex Urban Rail Transit Networks," Journal of Transportation Engineering, Part A: Systems, vol. 146 no. 1, DOI: 10.1061/jtepbs.0000284, 2020.
[27] E. H. Lee, K. Kim, S.-Y. Kho, D. K. Kim, S.-H. Cho, H. Y. Li, "Exploring for Route Preferences of Subway Passengers Using Smart Card and Train Log Data," Journal of Advanced Transportation, vol. 2022, DOI: 10.1155/2022/6657486, 2022.
[28] F. Zhou, R.-h. Xu, "Model of Passenger Flow Assignment for Urban Rail Transit Based on Entry and Exit Time Constraints," Transportation Research Record: Journal of the Transportation Research Board, vol. 2284 no. 1, pp. 57-61, DOI: 10.3141/2284-07, 2012.
[29] J. Zhao, F. Zhang, L. Tu, "Estimation of Passenger Route Choice Pattern Using Smart Card Data for Complex Metro Systems," IEEE Transactions on Intelligent Transportation Systems, vol. 18 no. 4, pp. 790-801, DOI: 10.1109/tits.2016.2587864, 2017.
[30] W. Zhu, W. Wang, Z. Huang, "Estimating Train Choices of Rail Transit Passengers With Real Timetable and Automatic Fare Collection Data," Journal of Advanced Transportation, vol. 2017, DOI: 10.1155/2017/5824051, 2017.
[31] B. Mo, Z. Ma, H. N. Koutsopoulos, J. Zhao, "Ex Post Path Choice Estimation for Urban Rail Systems Using Smart Card Data: An Aggregated Time-Space Hypernetwork Approach," Transportation Science, vol. 57 no. 2, pp. 313-335, DOI: 10.1287/trsc.2022.1177, 2023.
[32] J. Gu, Z. Jiang, Y. Sun, M. Zhou, S. Liao, J. Chen, "Spatio-temporal Trajectory Estimation Based on Incomplete Wi-Fi Probe Data in Urban Rail Transit Network," Knowledge-Based Systems, vol. 211, DOI: 10.1016/j.knosys.2020.106528, 2021.
[33] H. Zhang, G. Lu, Y. Lei, G. Zhang, I. Niyitanga, "A Hybrid Framework for Synchronized Passenger and Train Traffic Simulation in an Urban Rail Transit Network," International Journal of Rail Transportation, vol. 11 no. 6, pp. 912-941, DOI: 10.1080/23248378.2022.2109522, 2022.
[34] D. Cheng, Y. Li, S. Xia, G. Wang, J. Huang, S. Zhang, "A Fast Granular-Ball-Based Density Peaks Clustering Algorithm for Large-Scale Data," IEEE Transactions on Neural Networks and Learning Systems, vol. 35 no. 12, pp. 17202-17215, DOI: 10.1109/TNNLS.2023.3300916, 2024.
[35] Z. Wang, Y. Wang, "A New Density Peak Clustering Algorithm for Automatically Determining Clustering Centers," 2020 International Workshop on Electronic Communication and Artificial Intelligence, DOI: 10.1109/IWECAI50956.2020.00034, 2020.
[36] A. Rodriguez, A. Laio, "Clustering by Fast Search and Find of Density Peaks," Science, vol. 344 no. 6191, pp. 1492-1496, DOI: 10.1126/science.1242072, 2014.
[37] E. Ruppert, "Finding the K Shortest Paths in Parallel," Algorithmica, vol. 28 no. 2, pp. 242-254, DOI: 10.1007/s004530010038, 2000.
[38] G. Li, A. Chen, "Strategy-Based Transit Stochastic User Equilibrium Model With Capacity and Number-of-Transfers Constraints," European Journal of Operational Research, vol. 305 no. 1, pp. 164-183, DOI: 10.1016/j.ejor.2022.05.040, 2023.
[39] J. Xie, H. Gao, W. Xie, X. Liu, P. W. Grant, "Robust Clustering by Detecting Density Peaks and Assigning Points Based on Fuzzy Weighted K-Nearest Neighbors," Information Sciences, vol. 354, pp. 19-40, DOI: 10.1016/j.ins.2016.03.011, 2016.
[40] B. J. Frey, D. Dueck, "Clustering by Passing Messages between Data Points," Science, vol. 315 no. 5814, pp. 972-976, DOI: 10.1126/science.1136800, 2007.
[41] M. Ester, H.-P. Kriegel, J. Sander, X. Xu, "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases With Noise," 2nd International Conference on Knowledge Discovery and Data Mining, vol. 96, pp. 226-231, 1996.
[42] J. MacQueen, "Some Methods for Classification and Analysis of Multivariate Observations," 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, 1967.
[43] W. Zhu, W. Fan, J. Wei, W. D. Fan, "Complete Estimation Approach for Characterizing Passenger Travel Time Distributions at Rail Transit Stations," Journal of Transportation Engineering, Part A: Systems, vol. 146 no. 7, DOI: 10.1061/JTEPBS.0000375, 2020.
[44] E. Rendón, I. Abundez, A. Arizmendi, E. M. Quiroz, "Internal Versus External Cluster Validation Indexes," International Journal of Computers, Communications & Control, vol. 5 no. 1, pp. 27-34, 2011.
[45] N. X. Vinh, J. Epps, J. Bailey, "Information Theoretic Measures for Clusterings Comparison: Is a Correction for Chance Necessary?," 26th Annual International Conference on Machine Learning, pp. 1073-1080, DOI: 10.1145/1553374.1553511, 2009.
[46] R. Shang, W. Zhang, F. Li, L. Jiao, R. Stolkin, "Multi-Objective Artificial Immune Algorithm for Fuzzy Clustering Based on Multiple Kernels," Swarm and Evolutionary Computation, vol. 50, DOI: 10.1016/j.swevo.2019.01.001, 2019.
Copyright © 2025 Di Wen et al. Journal of Advanced Transportation published by John Wiley & Sons Ltd.