This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
At present, commonly used passenger flow prediction methods are based on historical data and include time-series methods, support vector machines, and neural networks [1–3]. For instance, Ni et al. [4] applied the autoregressive integrated moving average (ARIMA) method to traffic flow prediction and showed that it can model nonstationary time series. Xie et al. [5] designed a fuzzy time-series ARIMA method for long-term waterway traffic volume prediction. Li et al. [6] proposed a robust v-support vector regression (RSVR) method to forecast vessel traffic flow. Liu et al. [7] adopted support vector machine (SVM) based regression to predict bus passenger flow in a target time window. Li et al. [8] put forward a backpropagation neural network (BPNN) model with population per distance band for traffic flow prediction at urban rail transit stations. Hu et al. [9] developed a re-sample recurrent neural network (RRNN) model to forecast passenger traffic on mass rapid transit systems.
Because each prediction method has its own advantages and disadvantages, the performance of a single prediction method is often not ideal. Combining two or more methods into a hybrid prediction method can compensate for the deficiencies of a single prediction mechanism and improve the performance of passenger flow prediction [10, 11]. Khan et al. [12] combined the wavelet transform (WT) with an artificial neural network (ANN) and ARIMA into a hybrid model for meteorological drought forecasting; the model inherits the merits of both WT and ANN-ARIMA. Wu et al. [13] created a hybrid model of ARIMA and a wavelet neural network (WNN) combined with a genetic algorithm to predict river water quality. Yu et al. [14] built an SVR-ANN combined model with EEMD for rainfall prediction. Luo et al. [15] explored a combined prediction model based on empirical mode decomposition, support vector regression, and a wavelet neural network (EMD-SVR-WNN) to forecast structural settlement and deformation. These models achieved satisfactory results. They indicate that SVR and neural networks are suitable for complex nonlinear problems, and that time-series models have clear advantages for time-based prediction. However, neural network models still have inherent defects, such as a tendency to become trapped in local optima and to overfit. Therefore, SVR and a time-series method are selected for hybrid prediction.
In this paper, the autoregressive integrated moving average (ARIMA) model and the fuzzy support vector regression (FSVR) machine are combined to implement a hybrid forecasting strategy for railway passenger flow. The strategy is applied to the actual passenger flow forecast of the Shanghai-Guangzhou high-speed railway in order to obtain good forecast performance. Support vector regression (SVR) is a general learning method based on statistical learning theory (SLT) for limited samples [16]. FSVR is a type of support vector regression machine that combines fuzzy mathematics with support vector regression. It introduces fuzzy membership and improves the generalization ability of the learning machine. According to the theory of time-series analysis, the ARMA model is suitable for the prediction and analysis of stationary time series, whereas passenger flow data are generally nonstationary and must first be made stationary by differencing. Therefore, the ARIMA model, which includes this differencing step, is used to predict passenger flow.
2. ARIMA
The autoregressive integrated moving average (ARIMA) model is an important method for studying time series. In ARIMA (p, d, q), AR denotes the autoregressive part and p is the number of autoregressive terms, MA denotes the moving average part and q is the number of moving average terms, and d is the number of times the series is differenced to make it stationary. The ARIMA (p, d, q) model is an extension of the ARMA (p, q) model.
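The role of the differencing order d can be illustrated with a short sketch. The helper names `difference` and `undifference` are hypothetical, and the input values are the first four daily flows listed in Table 1, used here only as sample numbers.

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing to a sequence."""
    out = list(series)
    for _ in range(d):
        # Each round replaces the series by its consecutive changes.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def undifference(diffed, first_value):
    """Invert one round of differencing, given the first original value."""
    out = [first_value]
    for step in diffed:
        out.append(out[-1] + step)
    return out


# First four daily flows from Table 1 (illustrative input only).
flows = [1431, 1421, 1466, 1779]
d1 = difference(flows, d=1)           # day-to-day changes
d2 = difference(flows, d=2)           # changes of the changes
restored = undifference(d1, flows[0])
```

Each round of differencing shortens the series by one value, and forecasts made on the differenced scale have to be summed back, as in `undifference`, to return to passenger counts.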
The basic form of the ARMA model is
After differencing, the basic form of the ARIMA model is
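For reference, the two forms named above can be written in standard Box-Jenkins notation as follows. The symbols (the series X_t, the white-noise term ε_t, the coefficients φ_i and θ_j, and the backshift operator B) and the minus sign on the moving-average terms are assumptions taken from the common textbook presentation, not necessarily the notation of the original equations.

```latex
% ARMA(p, q): the current value depends on p past values and q past shocks
X_t = \varphi_1 X_{t-1} + \cdots + \varphi_p X_{t-p}
      + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}

% ARIMA(p, d, q): the same model applied to the d-times differenced series
\Phi(B)\,(1 - B)^d X_t = \Theta(B)\,\varepsilon_t,
\qquad
\Phi(B) = 1 - \varphi_1 B - \cdots - \varphi_p B^p,
\quad
\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q,
\quad
B X_t = X_{t-1}
```

Setting d = 0 in the second form recovers the first, which is the sense in which ARIMA extends ARMA.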
3. Fuzzy Support Vector Regression
The principle of FSVR is to find a function by minimizing the prediction error, use the nonlinear mapping function
In practical applications, different data points contribute differently to the training results. FSVR addresses the overlearning caused by noisy data by introducing fuzzy parameters that reduce the influence of noise [18]: a fuzzy membership degree is associated with each data point, so that a training set with fuzzy memberships is generated.
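The text does not state how the membership degrees are assigned. As one hypothetical choice, the sketch below gives more recent samples a larger membership, growing linearly from a lower bound `sigma` to 1. Both the function name and the linear rule are assumptions made for illustration, not the authors' scheme.

```python
def linear_time_membership(n_samples, sigma=0.1):
    """Hypothetical rule: membership grows linearly with the sample index.

    The oldest sample gets `sigma` and the newest gets 1.0, so older
    (possibly less representative) points are penalised less for errors.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if not 0.0 < sigma <= 1.0:
        raise ValueError("sigma must lie in (0, 1]")
    if n_samples == 1:
        return [1.0]
    span = 1.0 - sigma
    return [sigma + span * i / (n_samples - 1) for i in range(n_samples)]


memberships = linear_time_membership(5, sigma=0.2)

# In a fuzzy SVR each membership scales the error penalty of its sample,
# so the effective penalty for sample i is C * memberships[i].
C = 10.0  # illustrative penalty constant
effective_penalties = [C * s for s in memberships]
```

A sample with a small membership can be fitted poorly at little cost, which is how noisy points lose influence on the regression function.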
For FSVR, let the training set be
The boundary conditions are
FSVR is for solving quadratic programming problems:
The dual form of equation (5):
Solving the dual problem (6), we can get the FSVR regression function:
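As a reference for the formulation outlined in this section, a commonly used form of fuzzy ε-insensitive support vector regression is sketched below. The notation (memberships s_i with lower bound σ, slack variables ξ_i and ξ_i^*, penalty constant C, tube width ε, feature map φ, and kernel K) is an assumption based on the standard fuzzy SVM literature and may differ from the symbols in the original equations.

```latex
% Training set with fuzzy memberships and their bounds
S = \{(x_i, y_i, s_i)\}_{i=1}^{n}, \qquad
x_i \in \mathbb{R}^m,\; y_i \in \mathbb{R},\; \sigma \le s_i \le 1,\; \sigma > 0

% Primal quadratic program: memberships weight the slack penalties
\min_{w,\,b,\,\xi,\,\xi^*} \;
  \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} s_i\,(\xi_i + \xi_i^*)
\quad \text{s.t.} \quad
\begin{cases}
  y_i - w^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\
  w^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
  \xi_i,\ \xi_i^* \ge 0, \quad i = 1, \dots, n
\end{cases}

% Dual problem: memberships appear only in the box constraints
\max_{\alpha,\,\alpha^*} \;
  -\tfrac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n}
     (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\,K(x_i, x_j)
  - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*)
  + \sum_{i=1}^{n} y_i\,(\alpha_i - \alpha_i^*)
\quad \text{s.t.} \quad
\sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0, \qquad
0 \le \alpha_i,\ \alpha_i^* \le s_i C

% Regression function
f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\,K(x_i, x) + b,
\qquad K(x_i, x_j) = \phi(x_i)^{\top}\phi(x_j)
```

With every s_i equal to 1, this reduces to ordinary ε-SVR, so the fuzzy memberships act only as per-sample upper bounds on the dual variables.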
4. Experiments
Daily high-speed rail passenger flow between Shanghai and Guangzhou is used as the experimental data. A total of 176 days of sample data are collected; the first 165 days are used to build the model, and the last 8 days are used as test samples for the comparative analysis of the predictions.
In order to reduce the computational complexity and improve the accuracy of parameter selection, the raw data are normalized to the range [0, 1]. Table 1 shows part of the passenger flow data.
Table 1: Part of the passenger flow data.
SN | Date | Passenger flow | Normalized value
1 | 20190517 | 1431 | 0.4630
2 | 20190518 | 1421 | 0.4475
3 | 20190519 | 1466 | 0.5170
4 | 20190520 | 1779 | 1.0000
5 | 20190524 | 1213 | 0.1265
6 | 20190525 | 1219 | 0.1358
7 | 20190526 | 1138 | 0.0108
8 | 20190527 | 1131 | 0.0000
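The normalized column of Table 1 is consistent with min-max scaling in which the minimum and maximum are 1131 and 1779. The sketch below reproduces the column under that assumption. The text does not name the normalization method, and `min_max_normalize` is a hypothetical helper.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values]


# Passenger flow column of Table 1.
flows = [1431, 1421, 1466, 1779, 1213, 1219, 1138, 1131]
normalized = [round(x, 4) for x in min_max_normalize(flows)]
```

When the model is applied to new data, the same minimum and maximum have to be reused so that training and test values share one scale.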
Using the ARIMA model to predict the values, the results are as follows.
It can be seen from the prediction results shown in Figure 1 that the ARIMA model can predict railway passenger traffic. The fluctuation of its predictions is consistent with the actual passenger traffic curve, but there is a large delay, which causes a large prediction error, so the prediction effect is not ideal.
[figure omitted; refer to PDF]
The results of passenger flow prediction based on FSVR are as follows.
It can be seen from the prediction results shown in Figure 2 that FSVR has a strong nonlinear approximation ability and shows good prediction performance for railway passenger traffic, especially in short-term forecasting. Its prediction error is small while passenger traffic keeps increasing or keeps decreasing. However, at extreme points, where the trend changes from increasing to decreasing or from decreasing to increasing, the prediction error is large. In other words, dramatic fluctuations in passenger traffic reduce the generalization ability of FSVR and affect its prediction performance.
[figure omitted; refer to PDF]
Using the above ARIMA forecast results as the input items of FSVR, the mixed forecast of railway passenger traffic is realized. The results are as follows.
It can be seen from the prediction results shown in Figures 3 and 4 that the hybrid prediction method combines the advantages of the two prediction methods and obtains the best prediction results. Compared with the ARIMA method, the delay in the hybrid predictions is greatly reduced; compared with FSVR, the prediction at the extreme points is significantly improved, and the prediction error is greatly reduced.
[figure omitted; refer to PDF]
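The coupling described above, in which ARIMA forecasts become inputs of FSVR, can be sketched as a feature-building step. Everything here is an assumption made for illustration: the helper `build_hybrid_samples`, the choice of lagged flows as additional inputs, and the persistence forecast standing in for real ARIMA output. The Table 1 values are treated as consecutive days purely to have sample numbers.

```python
def build_hybrid_samples(flows, arima_forecasts, n_lags=2):
    """Build (features, targets) for the second-stage regressor.

    Each feature row holds the first-stage forecast for day t followed by
    the n_lags observed flows before day t; the target is the flow on day t.
    """
    if len(flows) != len(arima_forecasts):
        raise ValueError("flows and arima_forecasts must have equal length")
    if n_lags < 1 or n_lags >= len(flows):
        raise ValueError("n_lags must be between 1 and len(flows) - 1")
    features, targets = [], []
    for t in range(n_lags, len(flows)):
        row = [arima_forecasts[t]] + list(flows[t - n_lags:t])
        features.append(row)
        targets.append(flows[t])
    return features, targets


flows = [1431, 1421, 1466, 1779, 1213, 1219, 1138, 1131]

# Stand-in for ARIMA output: yesterday's value (persistence forecast).
# A fitted ARIMA model would supply these numbers in practice.
arima_like = [flows[0]] + flows[:-1]

X, y = build_hybrid_samples(flows, arima_like, n_lags=2)
```

The rows of `X` and the targets `y` would then be passed, together with the fuzzy memberships, to an FSVR solver. Additional inputs mentioned in the abstract, such as air price, could be appended to each row in the same way.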
To evaluate the performance of the proposed algorithm, it is compared with the ARIMA-WNN method and the EMD-SVR-WNN method. The results of the three hybrid prediction methods are shown in Figure 5.
[figure omitted; refer to PDF]
It can be seen from the prediction results in Figure 5 that, although the ARIMA-WNN method is accurate in the early predictions, a delay gradually appears after 4 days. The overall trend of the EMD-SVR-WNN method is consistent with the original data; however, its predicted values are low overall. Compared with these two methods, the prediction results of the ARIMA-FSVR method are more accurate. The forecast error indexes of the various methods are shown in Table 2.
Table 2: Forecast error indexes.
Index | ARIMA | FSVR | ARIMA-WNN | EMD-SVR-WNN | ARIMA-FSVR
RMSE | 77.9183 | 24.0677 | 55.5990 | 69.5719 | 10.7347
Correlation coefficient | −0.0318 | 0.9065 | 0.4907 | 0.4849 | 0.9822
P value | 0.9404 | 0.0019 | 0.2170 | 0.2233 | 0.0000
It can be seen from Table 2 that the RMSE of the ARIMA-FSVR prediction is smaller than that of the ARIMA and FSVR methods. It is also smaller than that of the other two hybrid methods. The correlation coefficient of the ARIMA-FSVR method is the highest (0.9822), and the corresponding P value in the last row of Table 2 is less than 0.0001.
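The error indexes in Table 2 can be computed as in the following sketch, which uses the usual definitions of the root mean square error and the Pearson correlation coefficient. The function names and the toy numbers are illustrative and are not the experimental data. The last helper converts a correlation coefficient into the t statistic on which its significance test is based, assuming n is the number of test points.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Toy values, not the experimental data.
toy_actual = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.0, 2.0, 3.0, 5.0]
toy_rmse = rmse(toy_actual, toy_predicted)
toy_r = pearson_r(toy_actual, toy_predicted)

# Correlation reported for FSVR in Table 2, with the 8 test days.
t_fsvr = correlation_t_statistic(0.9065, 8)
```

The P value then follows from the t distribution with n − 2 degrees of freedom, so with only 8 test days even moderate correlations, such as those near 0.49 in Table 2, are not statistically significant.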
It can be found from the experimental results that the ARIMA-FSVR method can accurately predict the railway passenger traffic, handle complex nonlinear relationships, and obtain satisfactory prediction results.
5. Conclusions
In this paper, a new hybrid method is proposed that improves on both the prediction accuracy and the robustness of the single models:
(1) The ARIMA-FSVR hybrid prediction method overcomes the shortcomings of the single forecasting methods and reduces the delay observed in the ARIMA predictions.
(2) The ARIMA-FSVR hybrid prediction method overcomes the extreme-point problem of the FSVR method.
(3) The empirical study on real passenger flow data indicates that the ARIMA-FSVR hybrid method is clearly superior to the benchmark hybrid models. It obtains the lowest prediction error, with higher accuracy and more reliable prediction results.
In conclusion, the ARIMA-FSVR hybrid method can accurately predict the railway passenger traffic, overcoming the shortcomings of the single-item forecasting method and, at the same time, merging the advantages of single-item forecasting and improving the accuracy of the forecast. This method effectively solves the nonlinear problem of railway traffic data and provides a new and effective method for the nonlinear prediction problem in practical applications.
Acknowledgments
The project was supported by Science and Technology Research Project of Beijing-Shanghai High Speed Railway Co., Ltd. (Grant no. Beijing-Shanghai Scientific Research-2020-2), Scientific Research Projects of China Academy of Railway Sciences Co., Ltd. (Grant no. 2019YJ120), and Science and Technology Research and Development Plan of China Railway (Grant no. K2019X022).
[1] G. Ren, J. Gao, "Comparison of NARNN and ARIMA models for short-term metro passenger flow forecasting," Proceedings of the 19th COTA International Conference of Transportation Professionals, DOI: 10.1061/9780784482292.119.
[2] Y. Sun, B. Leng, W. Guan, "A novel wavelet-SVM short-time passenger flow prediction in Beijing subway system," Neurocomputing, vol. 166, pp. 109-121, DOI: 10.1016/j.neucom.2015.03.085, 2015.
[3] T.-H. Tsai, C.-K. Lee, C.-H. Wei, "Neural network based temporal feature models for short-term railway passenger demand forecasting," Expert Systems with Applications, vol. 36, no. 2, pp. 3728-3736, DOI: 10.1016/j.eswa.2008.02.071, 2009.
[4] L. Ni, X. Chen, H. Qian, "ARIMA model for traffic flow prediction based on wavelet analysis," Proceedings of the 2nd International Conference on Information Science and Engineering, IEEE, pp. 1028-1031, DOI: 10.1109/ICISE.2010.5690910.
[5] Y. Xie, P. Zhang, Y. Chen, "A fuzzy ARIMA correction model for transport volume forecast," Mathematical Problems in Engineering, vol. 2021, DOI: 10.1155/2021/6655102, 2021.
[6] M.-W. Li, D.-F. Han, W.-L. Wang, "Vessel traffic flow forecasting by RSVR with chaotic cloud simulated annealing genetic algorithm and KPCA," Neurocomputing, vol. 157, pp. 243-255, DOI: 10.1016/j.neucom.2015.01.010, 2015.
[7] W. Liu, Q. Tan, W. Wu, "Forecast and early warning of regional bus passenger flow based on machine learning," Mathematical Problems in Engineering, vol. 2020, DOI: 10.1155/2020/6625435, 2020.
[8] J. Li, M. Yao, Q. Fu, "Forecasting method for urban rail transit ridership at station level using back propagation neural network," Discrete Dynamics in Nature and Society, vol. 2016, DOI: 10.1155/2016/9527584, 2016.
[9] R. Hu, Y. C. Chiu, C. W. Hsieh, T. H. Chang, L. Liao, "Mass rapid transit system passenger traffic forecast using a re-sample recurrent neural network," Journal of Advanced Transportation, vol. 2019, DOI: 10.1155/2019/8943291, 2019.
[10] S. Li, X. Liu, A. Lin, "Fractional frequency hybrid model based on EEMD for financial time series forecasting," Communications in Nonlinear Science and Numerical Simulation, vol. 89, DOI: 10.1016/j.cnsns.2020.105281, 2020.
[11] M. A. Jallal, A. González-Vidal, A. F. Skarmeta, S. Chabaa, A. Zerouala, "A hybrid neuro-fuzzy inference system-based algorithm for time series forecasting applied to energy consumption prediction," Applied Energy, vol. 268, DOI: 10.1016/j.apenergy.2020.114977, 2020.
[12] M. M. H. Khan, N. S. Muhammad, A. El-Shafie, "Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting," Journal of Hydrology, vol. 590, DOI: 10.1016/j.jhydrol.2020.125380, 2020.
[13] J. Wu, Z. B. Li, L. Zhu, C. Li, "Hybrid model of ARIMA model and GAWNN for dissolved oxygen content prediction," Transactions of the Chinese Society for Agricultural Machinery, vol. 48, pp. 205-210, DOI: 10.6041/j.issn.1000-1298.2017.S0.033, 2017.
[14] X. Yu, G. Ling, L. He, S. Xia, W. Wang, "A SVR-ANN combined model based on ensemble EMD for rainfall prediction," Applied Soft Computing, vol. 73, DOI: 10.1016/j.asoc.2018.09.018, 2018.
[15] X. Luo, W. Gan, L. Wang, Y. Chen, X. Meng, "A prediction model of structural settlement based on EMD-SVR-WNN," Advances in Civil Engineering, vol. 2020, no. 4, DOI: 10.1155/2020/8831965, 2020.
[16] X. Luo, D. Li, S. Zhang, "Traffic flow prediction during the holidays based on DFT and SVR," Journal of Sensors, vol. 2019, no. 10, DOI: 10.1155/2019/6461450, 2019.
[17] P. Huang, C. Wen, Li. P. Fu, Q. Y. Peng, Z. C. Li, "A hybrid model to improve the train running time prediction ability during high-speed railway disruptions," Safety Science, vol. 122, DOI: 10.1016/j.ssci.2019.104510, 2019.
[18] T. Bahraini, S. Ghazi, H. S. Yazdi, "Toward optimum fuzzy support vector machines using error distribution," Engineering Applications of Artificial Intelligence, vol. 90, DOI: 10.1016/j.engappai.2020.103545, 2020.
Copyright © 2021 Meng Ge et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/
Abstract
In order to improve the prediction accuracy of railway passenger traffic, an ARIMA model and FSVR are combined to propose a hybrid prediction method. The ARIMA prediction model is established based on the known railway passenger traffic data, and then, the ARIMA prediction results are used as the training set of the FSVR method. At the same time, the air price and historical passenger traffic data are introduced to predict the future passenger traffic, to realize the mixed prediction of railway passenger traffic. The case study demonstrates that the hybrid prediction method can effectively improve the prediction performance of railway passenger traffic. Compared with the single ARIMA method, the hybrid prediction method improves the delay of the prediction results. Compared with the FSVR prediction result, the hybrid prediction method greatly reduces the errors in the extreme points of passenger traffic and long-term prediction. The relevant research results of this paper provide a useful reference for the prediction of railway passenger traffic.