Abstract
Mass rapid transit (MRT) is one of the main public transport systems worldwide and serves many metropolitan areas. As travel demand grows, accurate prediction of MRT passenger flow is becoming increasingly urgent. This paper experimentally quantifies and analyzes how various combinations of traditional input features affect the accuracy of MRT passenger flow prediction. We build a series of passenger flow prediction models with different input features using a random forest approach. The selected features are passenger flow direction, temporal date, national holiday, lunar calendar date, previous average hourly passenger flow, and previous k-step hourly passenger flow and their trends; they are applied in a multistage combination of input features. Typical encoding strategies for the input features are also discussed and implemented. Finally, an optimal combination of input features is proposed through a case study at Taipei Main Station. The experimental results show that the proposed combination of input features, with appropriate encoding, helps improve the accuracy of passenger flow prediction on weekdays and weekends as well as on national holidays.
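The kind of feature construction described above can be illustrated with a minimal sketch. It covers only some of the listed features (calendar date, national holiday flag, previous k-step hourly flow and its trends). The function names `one_hot` and `build_features`, the one-hot encoding of weekday and hour, and the sample flow values are assumptions made for illustration; they are not taken from the paper. The resulting rows could then be passed to any random forest regressor.

```python
from datetime import datetime, timedelta


def one_hot(index, size):
    """Return a list of `size` zeros with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def build_features(timestamps, flows, k=3, holidays=frozenset()):
    """Build one feature row for every hour that has k previous hours.

    Returns (rows, targets), where rows[i] is the input used to
    predict targets[i]. All names here are hypothetical.
    """
    rows, targets = [], []
    for t in range(k, len(flows)):
        ts = timestamps[t]
        # Previous k-step hourly flows, oldest first.
        lags = flows[t - k:t]
        # Trends: hour-to-hour change between consecutive lags.
        trends = [lags[i + 1] - lags[i] for i in range(k - 1)]
        row = (
            one_hot(ts.weekday(), 7)   # day of week (Monday = 0)
            + one_hot(ts.hour, 24)     # hour of day
            + [1 if ts.date() in holidays else 0]  # national holiday flag
            + lags
            + trends
        )
        rows.append(row)
        targets.append(flows[t])
    return rows, targets


# Illustrative data only: six hourly counts starting at 06:00 on 1 Jan 2016.
start = datetime(2016, 1, 1, 6)
timestamps = [start + timedelta(hours=i) for i in range(6)]
flows = [100, 150, 230, 400, 380, 300]
rows, targets = build_features(timestamps, flows, k=3, holidays={start.date()})
```

One-hot encoding is only one of several possible encodings for calendar features; an ordinal code (a single integer per weekday or hour) yields a shorter input vector, and comparing such choices is the kind of experiment the abstract refers to.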
Acknowledgements
This work was supported by the Ministry of Science and Technology, Taiwan, R.O.C. (Grant Nos. MOST-107-2221-E-324-018-MY2 and MOST-106-2218-E-324-002); the National Natural Science Foundation of China (Grant No. 61672442); the Science and Technology Planning Project of Fujian Province, China (Grant No. 2016Y0079); and the Young Teacher Education and Research Development Project of Fujian Province (Grant No. JAT170416).
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Liu, L., Chen, RC., Zhao, Q. et al. Applying a multistage of input feature combination to random forest for improving MRT passenger flow prediction. J Ambient Intell Human Comput 10, 4515–4532 (2019). https://doi.org/10.1007/s12652-018-1135-2
DOI: https://doi.org/10.1007/s12652-018-1135-2