
A parallel grid-search-based SVM optimization algorithm on Spark for passenger hotspot prediction

Multimedia Tools and Applications

Abstract

Predicting passenger hotspots helps drivers pick up travelers quickly, reduces cruising costs, and maximizes revenue per unit time in intelligent transportation systems. To improve the accuracy and robustness of passenger hotspot prediction (PHP), this paper proposes a parallel Grid-Search-based Support Vector Machine (GS-SVM) optimization algorithm on Spark, which provides an efficient way to quickly locate passengers in a complex urban traffic network. Specifically, to locate passenger hotspots effectively, the urban road network is gridded on the Spark parallel distributed computing platform. Moreover, to enhance the accuracy of PHP, the grid search (GS) approach is employed to optimize the parameters of the radial basis function (RBF) kernel of the support vector machine (SVM), and cross-validation is used to find the globally optimal parameter combination. Finally, the SVM optimization algorithm is implemented on Spark to improve the robustness of PHP, and the proposed GS-SVM algorithm is applied to predict passenger hotspots. In an empirical study on seven groups of data sets, GS-SVM is compared with several state-of-the-art algorithms, including autoregressive integrated moving average (ARIMA), support vector regression (SVR), long short-term memory (LSTM), and convolutional neural network (CNN); the results indicate that the mean absolute percentage error (MAPE) of GS-SVM is at least 78.4% lower than that of the comparison algorithms.
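The grid search with cross-validation described in the abstract can be illustrated with a minimal, single-machine sketch. It is not the authors' implementation: it does not run on Spark, and to stay dependency-free it uses RBF kernel ridge regression as a stand-in for the SVM, with the ridge term 1/C playing the role of the SVM penalty parameter C. All function names, the parameter grids, and the synthetic hourly pickup data are assumptions made for illustration only.

```python
import math
from itertools import product

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rbf(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) for two feature tuples."""
    return math.exp(-gamma * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))

def solve(a, b):
    """Solve the linear system a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (m[i][n] - sum(m[i][j] * w[j] for j in range(i + 1, n))) / m[i][i]
    return w

def fit_predict(train_x, train_y, test_x, c, gamma):
    """Stand-in model (assumption): RBF kernel ridge regression with ridge term 1/c."""
    n = len(train_x)
    k = [[rbf(train_x[i], train_x[j], gamma) + (1.0 / c if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(k, train_y)
    return [sum(a * rbf(x, t, gamma) for a, t in zip(alpha, train_x)) for x in test_x]

def cv_mape(xs, ys, c, gamma, folds=3):
    """Average MAPE over `folds` interleaved cross-validation splits."""
    scores = []
    for f in range(folds):
        tr = [i for i in range(len(xs)) if i % folds != f]
        te = [i for i in range(len(xs)) if i % folds == f]
        pred = fit_predict([xs[i] for i in tr], [ys[i] for i in tr],
                           [xs[i] for i in te], c, gamma)
        scores.append(mape([ys[i] for i in te], pred))
    return sum(scores) / folds

def grid_search(xs, ys, c_grid, gamma_grid, folds=3):
    """Evaluate every (C, gamma) pair and return (best_pair, all_scores)."""
    scores = {(c, g): cv_mape(xs, ys, c, g, folds)
              for c, g in product(c_grid, gamma_grid)}
    return min(scores, key=scores.get), scores

# Hypothetical toy data: hour of day (scaled to [0, 1)) -> pickup count in one grid cell.
hours = [(h / 24.0,) for h in range(24)]
pickups = [40 + 30 * math.sin(2 * math.pi * h / 24.0) for h in range(24)]

best, scores = grid_search(hours, pickups, c_grid=[1.0, 10.0, 100.0],
                           gamma_grid=[0.1, 1.0, 10.0])
print("best (C, gamma):", best, "CV MAPE: %.2f%%" % scores[best])
```

Each (C, gamma) pair is scored independently of the others, which is what makes this kind of search a natural fit for a parallel platform such as Spark: the pairs could in principle be distributed across workers and the lowest cross-validated MAPE selected at the end.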


Acknowledgments

The work described in this paper was supported in part by the National Natural Science Foundation of China (Grant nos. 61762020, 62162012, 61773321, 62072061, and 62173278), the Science and Technology Talents Fund for Excellent Young of Guizhou (Grant no. QKHPTRC20195669), the Science and Technology Support Program of Guizhou (Grant no. QKHZC2021YB531), and the Scientific Research Platform Project of Guizhou Minzu University (Grant no. GZ-MUSYS[2021]04).

Author information

Corresponding authors

Correspondence to Dawen Xia or Huaqing Li.

Ethics declarations

Conflict of Interests

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xia, D., Zheng, Y., Bai, Y. et al. A parallel grid-search-based SVM optimization algorithm on Spark for passenger hotspot prediction. Multimed Tools Appl 81, 27523–27549 (2022). https://doi.org/10.1007/s11042-022-12077-x

  • DOI: https://doi.org/10.1007/s11042-022-12077-x
