Full text

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

The rapid and continuous development of China has led to an increase in the number of vehicles. The National Bureau of Statistics of China announced that the number of privately owned vehicles reached 261.5 million in 2019, an increase of 21.22 million within a year, and that 96 cities in China had more than one million registered vehicles [1]. The rapid increase in vehicles causes traffic congestion, parking problems, and environmental pollution. Public transportation carries a larger number of passengers and alleviates such problems, and mass transportation consumes less energy and emits fewer pollutants than private transport. Therefore, urban planning gives priority to public transportation. New technologies such as bus rapid transit (BRT) and driverless buses have been developed with substantial investment to support the public transportation system. However, a trip by bus takes a relatively long time and is often not punctual, which makes people avoid it. Encouraging people to use buses more often requires optimized bus routes and punctual bus operation [2, 3]. The absence of an accurate operation schedule often causes long waiting times and bus bunching on the same route. Punctual operation requires an optimized bus schedule, which in turn requires accurate prediction of the arrival times of buses along a route. Such prediction not only meets the demand of ordinary passengers who want to know when a bus will arrive at their boarding station but also supports the optimization of the intelligent bus scheduling system and improves the operating efficiency of the bus company.

Several types of neural networks have been used to predict the arrival time of a bus: non-RNN networks, RNNs for time series, and temporal and spatial RNNs. Studies that adopted non-RNN networks predicted bus arrival and operation times using (1) MapReduce-based clustering with K-means [4], (2) a backpropagation (BP) neural network model [5], (3) a particle swarm algorithm [6], and (4) a wide-depth recursive (WDR) learning model [7]. RNNs for time series, such as long short-term memory (LSTM), have also been used [8]. Models with LSTM processed the historical data of the global positioning system (GPS) and bus stop locations together with the influence of different routes, drivers, weather conditions, time distribution [9], heterogeneous traffic flow, and real-time data [10–12]. Temporal and spatial RNNs include ConvLSTM, which was originally used to predict precipitation [13], and the spatiotemporal property model (STPM). These networks were also used for predicting bus arrival times based on the total operation time of a bus on a route, waiting and on-board times, and transfer location wait times [14–16], for predicting multilane short-term traffic flow [17], and for creating a multitime step deep neural network [18].

A bus runs on fixed lines with fixed stations, and the spatial relationship between the stations determines the arrival times in the time series. Thus, this study used an RNN to predict the arrival time of a bus. A bus route generally has 30–40 stations. Arrival time prediction includes predicting the time at each station along the way from the first to the last stop, the arrival times at subsequent stations, and the arrival time of the nearest vehicle to a station. This study first analyzed the bus arrival time. Based on the analysis, the input feature vectors of the neural network were defined, and seven RNN models from four categories were tested for predicting the arrival time. The models were trained with the measured arrival and departure times of buses on a route in Linyi, Shandong Province. Finally, multistep prediction of the arrival time was carried out.

This paper is organized as follows. Section 2 describes the theoretical background and introduces the recurrent neural network. Section 3 describes the pretreatment and analysis of data. Section 4 discusses the analysis result of the RNN model. Finally, Section 5 concludes this study.

2. Theoretical Background

A recurrent neural network (RNN) [19] has a feedback structure that processes sequential data for time-series prediction or classification. RNNs are widely used in various applications, and variants such as LSTM, GRU, and ConvLSTM have been proposed. According to the data in this study, we divided the prediction models into four categories and adopted multistep prediction for bus arrival times. Time-series input data are essential for prediction with optimal feature extraction and memory efficiency. The data are processed in an RNN with internal feedback and feedforward connections, which retain and reflect the state, or memory, of a long context window [20]. A plain RNN suffers from the vanishing gradient (gradient disappearance) and exploding gradient problems [21–23], which make training difficult and limit its applications. To solve these problems, Hochreiter et al. [24] proposed LSTM, which was subsequently improved for different applications [25, 26]. LSTM specializes in memorizing long sequences and effectively avoids the vanishing gradient problem. The hidden layers of LSTM use memory blocks that store the previous sequence information, and three gates (input, output, and forget gates) control which sequence information is kept in memory. The gated recurrent unit (GRU) [27] is a modestly simplified LSTM. GRU combines the forget and input gates into a single update gate and merges the cell state and hidden state. A model with GRU is simpler and has fewer activation function and output computations than the standard LSTM model.

2.1. Pure LSTM and Pure GRU Model

Figure 1 shows an LSTM, in which the hidden units are replaced by memory blocks.

[figure omitted; refer to PDF]

Calculating $c_{t}$ and $h_{t}$ requires the following equations: $\begin{matrix} i_{t} = \sigma\left( W_{i} X_{t} + U_{i} h_{t - 1} + b_{i} \right) & \text{(input gate)}, \\ f_{t} = \sigma\left( W_{f} X_{t} + U_{f} h_{t - 1} + b_{f} \right) & \text{(forget gate)}, \\ o_{t} = \sigma\left( W_{o} X_{t} + U_{o} h_{t - 1} + b_{o} \right) & \text{(output gate)}, \\ {\tilde{c}}_{t} = \tanh\left( W_{c} X_{t} + U_{c} h_{t - 1} + b_{c} \right) & \text{(new memory cell)}, \\ c_{t} = f_{t} \odot c_{t - 1} + i_{t} \odot {\tilde{c}}_{t} & \text{(memory cell)}, \\ h_{t} = o_{t} \odot \tanh\left( c_{t} \right). & \end{matrix} \quad (1)$

In these equations, ⊙ denotes the Hadamard product, that is, the multiplication of the corresponding elements of two matrices; $W_{i}$ , $W_{f}$ , $W_{o}$ , and $W_{c}$ are the weights of $X_{t}$ ; $U_{i}$ , $U_{f}$ , $U_{o}$ , and $U_{c}$ are the weights of $h_{t - 1}$ ; $b_{i}$ , $b_{f}$ , $b_{o}$ , and $b_{c}$ are the bias terms; σ is the sigmoid function; and $\tanh$ is the hyperbolic tangent function.
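As a minimal sketch of equation (1), the following pure-Python function performs one LSTM time step on small lists. The helper names (`sigmoid`, `matvec`, `gate`, `lstm_step`) and the layout of `params` are illustrative assumptions, not part of the original study, and the zero-valued weights serve only to make the result easy to check by hand.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def gate(W, U, b, x, h_prev, activation):
    """Compute activation(W x + U h_prev + b) element by element."""
    wx = matvec(W, x)
    uh = matvec(U, h_prev)
    return [activation(p + q + r) for p, q, r in zip(wx, uh, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One step of equation (1); params maps 'i', 'f', 'o', 'c' to (W, U, b)."""
    i = gate(*params["i"], x, h_prev, sigmoid)          # input gate
    f = gate(*params["f"], x, h_prev, sigmoid)          # forget gate
    o = gate(*params["o"], x, h_prev, sigmoid)          # output gate
    c_new = gate(*params["c"], x, h_prev, math.tanh)    # new memory cell
    # Hadamard products: memory cell, then hidden state
    c = [ft * cp + it * cn for ft, cp, it, cn in zip(f, c_prev, i, c_new)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


# Illustrative check with all weights and biases set to zero (an assumption):
# every gate equals 0.5 and the new memory cell equals 0.
zero = ([[0.0, 0.0]], [[0.0]], [0.0])   # W is 1x2, U is 1x1, b has length 1
params = {name: zero for name in "ifoc"}
h, c = lstm_step([3.0, -1.0], [0.2], [1.0], params)
```

With these zero weights the memory cell is simply halved by the forget gate, so `c` is `[0.5]` and `h` is `[0.5 * tanh(0.5)]`.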

Figure 2 shows the GRU. There is only one hidden state $h_{t}$ in GRU. Through linear transformations of the input tensor and the hidden state, the weights that control the inflow of the hidden state are calculated with equations (2) and (3). The linear transformation of $r_{t}$ , $h_{t - 1}$ , and the input tensor is combined with the activation function in equation (4) to calculate the updated value of the hidden state. The mixing weight applied to the hidden state of the previous step is shown in equation (5). The final output $h_{t}$ is the same as in LSTM. Compared with LSTM, GRU requires one fewer activation function calculation and output calculation, as well as a simpler final hidden state update, so the computation is relatively simple.
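Equations (2)–(5) are not reproduced in this text. For reference, a commonly used form of the GRU [27], written in the notation of equation (1), is sketched here. The assignment of the numbers (2)–(5) to the individual lines and the symbols $z_{t}$ (update gate), ${\tilde{h}}_{t}$ (candidate hidden state), and the weights $W_{z}$ , $W_{r}$ , $W_{h}$ , $U_{z}$ , $U_{r}$ , $U_{h}$ , $b_{z}$ , $b_{r}$ , and $b_{h}$ are assumptions and may differ from the original equations.

$\begin{matrix} z_{t} = \sigma\left( W_{z} X_{t} + U_{z} h_{t - 1} + b_{z} \right) & (2) \\ r_{t} = \sigma\left( W_{r} X_{t} + U_{r} h_{t - 1} + b_{r} \right) & (3) \\ {\tilde{h}}_{t} = \tanh\left( W_{h} X_{t} + U_{h}\left( r_{t} \odot h_{t - 1} \right) + b_{h} \right) & (4) \\ h_{t} = \left( 1 - z_{t} \right) \odot h_{t - 1} + z_{t} \odot {\tilde{h}}_{t} & (5) \end{matrix}$

Some references swap the roles of $z_{t}$ and $1 - z_{t}$ in the last line; the two conventions are equivalent up to a relabeling of the update gate.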

[figure omitted; refer to PDF]

These models use a single layer of LSTM or GRU. In the input layer, variables such as route, direction, vehicle model, and driver are also regarded as part of the time sequence.

2.2. Multi-Input Model Separated by Time Series

Some variables are not sensitive to any specific ordering, so the RNN alone cannot process them properly. However, a BP network can process them through a connection layer. Thus, an integration of RNN and BP was used for the prediction network (Figure 4).

[figure omitted; refer to PDF]

The integrated network was designed in accordance with the characteristics of the input data. The two-part network used the time series-related input data such as route number, driver, departure time, and route length for LSTM processing. The outputs were then passed through a connection layer to the prediction layer. Since the time-series input data became shorter, the total number of trainable parameters did not increase significantly compared with the pure LSTM, even with the additional network.
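The merge step of such a two-branch network can be sketched as follows. This is only an illustration under stated assumptions: `seq_encoding` stands for the output of the recurrent branch (for example, the final hidden state), `static_features` stands for the order-insensitive variables, and the function names, the ReLU activation, and all weight values are hypothetical rather than taken from the study.

```python
def relu(x):
    return max(0.0, x)


def dense(weights, bias, inputs, activation=None):
    """Fully connected layer: activation(weights @ inputs + bias)."""
    outputs = []
    for row, b in zip(weights, bias):
        value = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(value) if activation else value)
    return outputs


def multi_input_predict(seq_encoding, static_features, params):
    """Merge the recurrent branch with the branch for static variables."""
    # Branch for order-insensitive variables (BP-style dense layer)
    static_hidden = dense(params["W_static"], params["b_static"],
                          static_features, relu)
    # Connection layer: concatenate both branches
    merged = list(seq_encoding) + static_hidden
    # Prediction layer: a single linear output (the arrival time)
    return dense(params["W_out"], params["b_out"], merged)[0]


# Hypothetical toy values, chosen only so the result can be checked by hand
params = {
    "W_static": [[1.0, 0.0], [0.0, -1.0]],
    "b_static": [0.0, 0.0],
    "W_out": [[2.0, 2.0, 3.0, 4.0]],
    "b_out": [1.0],
}
prediction = multi_input_predict([0.5, -0.5], [1.0, 2.0], params)
```

Here the static branch yields `[1.0, 0.0]` after the ReLU, the merged vector is `[0.5, -0.5, 1.0, 0.0]`, and the linear prediction layer returns 4.0.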

2.3. LSTM Stacking Model

To achieve better prediction accuracy than a single layer, a multilayer LSTM was employed. The four stacked LSTM layers had 256, 128, 64, and 32 hidden units, respectively. Figure 5(a) shows the diagram of the stacking models. There is also a bidirectional (two-way) LSTM composition, in which the forward and backward connections also employ a reverse projection function, which is suitable in our case to verify arrival time predictions. Figure 5(b) shows the diagram of the bidirectional network models.

[figure omitted; refer to PDF]
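To give a feel for the size of the stacked model, the sketch below counts trainable parameters with the textbook formulas implied by the gate equations: an LSTM layer has four sets of (W, U, b) and a GRU layer has three. The input dimension `N_FEATURES = 10` is an assumption made for illustration, and framework implementations may add extra bias terms, so the numbers are indicative only.

```python
def lstm_params(n_in, n_hidden):
    """Four gate blocks (i, f, o, c), each with W, U, and b."""
    return 4 * (n_hidden * n_in + n_hidden * n_hidden + n_hidden)


def gru_params(n_in, n_hidden):
    """Three blocks (update gate, reset gate, candidate state)."""
    return 3 * (n_hidden * n_in + n_hidden * n_hidden + n_hidden)


def stacked_params(n_in, hidden_sizes, per_layer):
    """Sum the parameters of a stack; each layer feeds the next one."""
    total = 0
    for n_hidden in hidden_sizes:
        total += per_layer(n_in, n_hidden)
        n_in = n_hidden
    return total


N_FEATURES = 10                    # assumed input dimension
HIDDEN_SIZES = [256, 128, 64, 32]  # hidden units stated in the text

lstm_total = stacked_params(N_FEATURES, HIDDEN_SIZES, lstm_params)
gru_total = stacked_params(N_FEATURES, HIDDEN_SIZES, gru_params)
print(lstm_total, gru_total)  # 532352 399264
```

Under these assumptions, a GRU stack of the same shape has three quarters of the parameters of the LSTM stack.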

The results reveal the following:

(1) The GRU was more efficient than the LSTM model, with fewer parameters and comparable accuracy

(2) The LSTM models, except the ConvLSTM, had more parameters and higher network accuracy than the other models

(3) The properties of the dataset influenced the complexity of the models rather than their results

(4) The ConvLSTM showed the highest accuracy because it processed both temporal and spatial data, which indicates the need to include space-related properties

In arrival time-series prediction, the arrival times at subsequent bus stops are predicted from those at the previous bus stops. The ConvLSTM network model was selected to analyze the prediction accuracy through one-step, two-step, and total time prediction.
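The dependence of later predictions on earlier ones can be illustrated with a recursive multistep loop. This is a sketch under stated assumptions: `make_toy_one_step` is a hypothetical stand-in for a trained one-step model (it is not the ConvLSTM of this study), and the arrival times are invented values in seconds.

```python
def predict_multistep(one_step, history, n_steps, window=3):
    """Recursive multistep prediction: each predicted arrival time is
    appended to the sequence and used as input for the next step."""
    seq = list(history)  # needs at least two observed arrival times
    preds = []
    for _ in range(n_steps):
        nxt = one_step(seq[-window:])
        preds.append(nxt)
        seq.append(nxt)
    return preds


def make_toy_one_step(bias=0.0):
    """Hypothetical one-step model: last arrival time plus the mean gap
    between recent stops, plus a constant bias in seconds."""
    def one_step(recent):
        gaps = [b - a for a, b in zip(recent, recent[1:])]
        return recent[-1] + sum(gaps) / len(gaps) + bias
    return one_step


history = [0.0, 120.0, 240.0]   # observed arrival times (invented)
truth = [360.0, 480.0, 600.0]   # later arrival times (invented)

preds = predict_multistep(make_toy_one_step(bias=5.0), history, n_steps=3)
errors = [p - t for p, t in zip(preds, truth)]
print(errors)  # [5.0, 12.5, 23.75]
```

Although the toy model is off by only 5 s per step, the error grows with every step because each biased prediction is fed back as an input.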

Figure 12 shows the test samples on the x-axis and the difference between the predicted and real values on the y-axis. The mean and RMSE were calculated as the mean and the root mean square of these differences. The one-step prediction had the highest accuracy, and the total time prediction (multistep prediction) had the lowest. The regularity in the histograms of Figure 12 reveals that the one-step prediction has the smallest deviation and the total time prediction has the largest error, which is related to the accumulation and propagation of errors in the prediction of the arrival times at subsequent bus stops.
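The two statistics can be computed as in the following sketch. The function name `error_stats` and the sample values (in seconds) are illustrative assumptions and are not results from the study.

```python
import math


def error_stats(predicted, actual):
    """Return the mean and RMSE of the differences predicted - actual."""
    diffs = [p - a for p, a in zip(predicted, actual)]
    mean = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mean, rmse


# Invented sample values in seconds; the differences are 4, -2, 4, and 2
predicted = [64.0, 118.0, 184.0, 122.0]
actual = [60.0, 120.0, 180.0, 120.0]
mean, rmse = error_stats(predicted, actual)
```

For these values the mean difference is 2.0 s and the RMSE is the square root of 10, which shows that the RMSE penalizes large individual deviations more strongly than the mean does.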

[figures omitted; refer to PDF]

5. Conclusion

The public transport system is a complex system with a high degree of uncertainty. Bus arrival time prediction is a multistep prediction problem in which this uncertainty leads to poor prediction accuracy. This paper first analyzed the main variables affecting this uncertainty and then selected route, direction, vehicle, driver, departure hour, departure minute, day of the week, holiday, distance from the starting location, and weather. The series of arrival times at the stops before the current bus stop was also selected. These variables fully reflected the impact on arrival time-series prediction. Among RNNs for time-series analysis, we processed the data using seven network models from four types of networks.

We analyzed and compared the predictive power of the seven RNN models with the variables and parameters in the measured dataset. Adding variables improved the prediction accuracy of the one- and two-step prediction models but not of the multistep (total time prediction) model, where it only increased the network complexity. The ConvLSTM showed the highest prediction accuracy with spatiotemporal data. The statistics of one-, two-, and multistep prediction showed that, as the number of steps increased, the accumulation and propagation of the sequence prediction error caused a larger deviation of the predicted time. Accurate bus arrival time prediction encourages more people to use buses and allows operating companies to optimize bus schedules and increase the efficiency of their operation. This also improves traffic conditions in cities.

Accurate bus arrival information also relieves the anxiety of users by decreasing waiting time and helps to provide passengers with an improved service. The accurate prediction of bus arrival times can be integrated into an intelligent bus scheduling system in a smart transportation system. Such a system improves the management of a public transport system, increases the economic benefits of the system, and ultimately brings social benefits.

Acknowledgments

This work was supported by the Fujian Province Natural Fund Project (Grant no. 2020J01263), Science and Technology Planning Foreign Cooperation Project of Longyan (Grant no. 2019LYF7003), Open Fund Project of Fujian University Engineering Research Center for Disaster Prevention and Mitigation of Southeast Coastal Engineering Structure of Putian University (Grant no. 2019005), and Open Foundation Project of Fujian Provincial Key Laboratory of Higher Education (Putian University) (Grant no. ST19004).

References

[1] National Bureau of Statistics, Statistical Bulletin of the People’s Republic of China on National Economic and Social Development, 2020.

[2] H. Lu, Z. Sun, W. Qu, "Big data and its applications in urban intelligent transportation system," Journal of Transportation Systems Engineering and Information Technology, vol. 15, pp. 45-52, 2015.

[3] D. Li, Y. Yao, Z. Shao, "Big data in smart city," Geomatics and Information Science of Wuhan University, vol. 39, pp. 631-640, 2017.

[4] F. Xie, J. Gu, S. Zhang, "Predicting model of bus arrival time based on Map reduce clustering and neural network," Journal of Computer Application, vol. 37, pp. 118-129, 2017.

[5] L. Wang, Q. Su, R. Zheng, "Bus arrival time prediction based on Elman’s dynamic neural network," Mechanical & Electrical Technology, vol. 35, pp. 135-139, 2012.

[6] Y. Ji, J. Lu, X. Chen, "Prediction model of bus arrival time based on particle swarm optimization and wavelet neural network," Journal of Transportation Systems Engineering and Information Technology, vol. 16, pp. 60-66, 2016.

[7] Z. Wang, K. Fu, J. Ye, "Learning to estimate the travel time," Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, ACM, pp. 858-866.

[8] J. Lu, L. Sun, Q. Shi, "Prediction of bus arrival time based on gated recurrent unit neural networks," Journal of Nantong University (Natural Science Edition), vol. 19, pp. 43-49, 2020.

[9] Q. Han, K. Liu, L. Zeng, G. He, L. Ye, F. Li, "A bus arrival time prediction method based on position calibration and LSTM," IEEE Access, vol. 8, pp. 42372-42383, DOI: 10.1109/ACCESS.2020.2976574, 2020.

[10] A. A. Agafonov, A. S. Yumaganov, "Bus arrival time prediction using recurrent neural network with LSTM architecture," Optical Memory and Neural Networks, vol. 28 no. 3, pp. 222-230, DOI: 10.3103/S1060992X19030081, 2019.

[11] W. Xiangxue, X. Lunhui, C. Kaixun, "Data-driven short-term forecasting for urban road network traffic based on data processing and LSTM-RNN," Arabian Journal for Science and Engineering, vol. 44 no. 4, pp. 3043-3060, DOI: 10.1007/s13369-018-3390-0, 2019.

[12] Z. Huang, Q. Li, F. Li, J. Xia, "A novel bus-dispatching model based on passenger flow and arrival time prediction," IEEE Access, vol. 7, pp. P106453-P106465, DOI: 10.1109/access.2019.2932801, 2019.

[13] X. Shi, Z. Chen, H. Wang, D.-Y. Yeung, W. Wong, W. Woo, "Convolutional LSTM network: a machine learning approach for precipitation nowcasting," vol. 1, 2015.

[14] Y. Lai, L. Zhang, F. Yang, W. Lu, T. Wang, "Bus arrival time prediction algorithm based on the spatio-temporal correlation attribute model," Ruan Jian Xue Bao/Journal of Software, vol. 31 no. 3, pp. 648-662, 2020.

[15] H. Liu, H. Xu, Y. Yan, Z. Cai, T. Sun, W. Li, "Bus arrival time prediction based on LSTM and spatial-temporal feature vector," IEEE Access, vol. 8, pp. 11917-11929, DOI: 10.1109/ACCESS.2020.2965094, 2020.

[16] P. He, G. Jiang, S.-K. Lam, Y. Sun, "Learning heterogeneous traffic patterns for travel time prediction of bus journeys," Information Sciences, vol. 512, pp. 1394-1406, DOI: 10.1016/j.ins.2019.10.073, 2020.

[17] Y. Ma, Z. Zhang, A. Ihler, "Multi-lane short-term traffic forecasting with convolutional LSTM network," IEEE Access, vol. 8, pp. 34629-34643, DOI: 10.1109/ACCESS.2020.2974575, 2020.

[18] N. C. Petersen, F. Rodrigues, F. C. Pereira, "Multi-output bus travel time prediction with convolutional LSTM neural network," Expert Systems with Applications, vol. 120, pp. 426-435, DOI: 10.1016/j.eswa.2018.11.028, 2019.

[19] R. J. Williams, J. Peng, "An efficient gradient-based algorithm for on-line training of recurrent network trajectories," Neural Computation, vol. 2 no. 4, pp. 490-501, DOI: 10.1162/neco.1990.2.4.490, 1990.

[20] Z. C. Lipton, J. Berkowitz, C. Elkan, "A critical review of recurrent neural networks for sequence learning," 2015. https://arxiv.org/abs/1506.00019

[21] Lechevallier, Saporta - 2010 - in Proceedings of COMPSTAT’2010.pdf

[22] T. Zhang, "Solving large scale linear prediction problems using stochastic gradient descent algorithms," International Conference on Machine Learning. Omnipress., vol. 116, 2004.

[23] K. S. Tai, R. Socher, C. D. Manning, "Improved semantic representations from tree-structured long short-term memory networks," 2015. https://arxiv.org/abs/1503.00075

[24] S. Hochreiter, J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9 no. 8, DOI: 10.1162/neco.1997.9.8.1735, 1997.

[25] J. Schmidhuber, F. Cummins, "Learning to forget: continual prediction with LSTM," Proceedings of the 1999 Ninth International Conference on Artificial Neural Networks ICANN 99, DOI: 10.1049/cp:19991218.

[26] F. A. Gers, N. N. Schraudolph, J. Schmidhuber, "Learning precise timing with LSTM recurrent networks," Journal of Machine Learning Research, vol. 29, 2002.

[27] J. Chung, C. Gulcehre, K. Cho, Y. Bengio, "Empirical evaluation of gated recurrent neural networks on sequence modeling," 2014. https://arxiv.org/abs/1412.3555

Copyright © 2021 Zhi-Ying Xie et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/

Abstract

Accurate predictions of bus arrival times help passengers arrange their trips easily and flexibly and improve travel efficiency. Managing and scheduling the arrival times of buses is therefore important for deploying buses efficiently and easing traffic congestion, which improves the service quality of the public transport system. However, because many variables disturb scheduled transportation, accurate prediction is challenging. To predict the arrival time of a bus accurately, this research adopted a recurrent neural network (RNN). The variables affecting the bus arrival time were investigated from a data set containing the route, driver, weather, and schedule. Then, a stacked multilayer RNN model was created with the variables, which were categorized into four groups. The RNN model with a separate multi-input and spatiotemporal sequence model was applied to the arrival and departure times of buses along an entire bus route in Linyi, Shandong. The simulation results revealed that the convolutional long short-term memory (ConvLSTM) model showed the highest accuracy among the tested models. The propagation of error and the number of prediction steps influenced the prediction accuracy.

Details

Title

Multistep Prediction of Bus Arrival Time with the Recurrent Neural Network

Author

Zhi-Ying Xie¹; Yuan-Rong He¹; Chih-Cheng Chen²; Qing-Quan Li³; Chia-Chun Wu⁴

¹ School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China; Digital Fujian Institute of Natural Disaster Monitoring Big Data, Xiamen, Fujian 361024, China
² Department of Automatic Control Engineering, Feng Chia University, Taichung 40724, Taiwan; Department of Aeronautical Engineering, Chaoyang University of Technology, Taichung 413, Taiwan
³ Shenzhen Key Laboratory of Spatial Smart Sensing and Services, Shenzhen University, Shenzhen 518060, China
⁴ Department of Industrial Engineering and Management, National Quemoy University, Kinmen 892, Taiwan

Editor

Bosheng Song

Publication year

2021

Publication date

2021

Publisher

John Wiley & Sons, Inc.

ISSN

1024123X

e-ISSN

15635147

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1155/2021/6636367

ProQuest document ID

2503353187
