DOI: 10.1145/3297067.3297093
research-article

Expressway Crash Prediction based on Traffic Big Data

Published: 28 November 2018 Publication History

Abstract

With the development of society, the number of vehicles has increased rapidly. Vehicles play an important role in people's lives, but the traffic safety problems they cause have also become increasingly prominent. In China, high crash and casualty rates on expressways have long troubled traffic management departments, so crash prediction on expressways has become vital. Conventionally, crash prediction is based on traffic flow data, which do not contain all the necessary factors. In this paper, we propose a prediction method that uses real-world data, including historical accident data, road geometry data, vehicle speed data, and weather data. We treat crash prediction as a binary classification problem. In practice, sample imbalance is a major challenge for classification; we handle it by modifying sample weights. Three machine learning classification techniques, namely Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and XGBoost, are each applied to the crash prediction task. The best recall and precision of these models are 0.764253 and 0.01062, respectively. The proposed method can be integrated into urban traffic control systems to support police dispatch and crash prevention.
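
The abstract does not say how the sample weights are modified, so the following is only a minimal sketch of one common choice, assuming the "balanced" heuristic (weight = n_samples / (n_classes * class_count)). It also shows how recall and precision are computed for the crash class, and why precision can stay very low under heavy imbalance even when recall is high. The function names, the toy labels, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Weight each sample inversely to its class frequency.

    Assumption: the common 'balanced' heuristic
    n_samples / (n_classes * class_count); the paper's exact
    weighting scheme is not stated in the abstract.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return [n_samples / (n_classes * counts[y]) for y in labels]


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (crash = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy labels: 2 crash samples among 1000 (not the paper's data).
y_true = [1, 1] + [0] * 998
weights = balanced_sample_weights(y_true)

# Hypothetical predictions: both crashes flagged, plus 98 false alarms.
y_pred = [1, 1] + [1] * 98 + [0] * 900
precision, recall = precision_recall(y_true, y_pred)
```

With these toy numbers each crash sample gets weight 250 and each non-crash sample about 0.5, so both classes carry the same total weight. Weights of this kind are typically passed to a classifier's training routine, which pushes the model to flag more of the rare class and raises recall at the cost of more false alarms.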

    Published In

    SPML '18: Proceedings of the 2018 International Conference on Signal Processing and Machine Learning
    November 2018
    177 pages
    ISBN:9781450366052
    DOI:10.1145/3297067

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Crash prediction
    2. feature extraction and selection
    3. machine learning
    4. sample imbalance

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SPML '18

    Cited By

    • (2025) Big Data Analytics in Behavior Analysis. Mapping Human Data and Behavior With the Internet of Behavior (IoB), 205-238. DOI: 10.4018/979-8-3693-7545-7.ch009. Online publication date: 7-Feb-2025.
    • (2024) The Effectiveness of Big Data-Driven Predictive Policing: Systematic Review. Justice Evaluation Journal, 1-34. DOI: 10.1080/24751979.2024.2371781. Online publication date: 5-Jul-2024.
    • (2024) Explainable artificial intelligence for decarbonization: Alternative fuel vehicle adoption in disadvantaged communities. International Journal of Sustainable Transportation 18:5, 393-407. DOI: 10.1080/15568318.2024.2311813. Online publication date: 7-Feb-2024.
    • (2024) Enhancing road safety with machine learning. Engineering Applications of Artificial Intelligence 137:PA. DOI: 10.1016/j.engappai.2024.109086. Online publication date: 1-Nov-2024.
    • (2023) Network-Level Safety Metrics for Overall Traffic Safety Assessment: A Case Study. IEEE Access 11, 17755-17778. DOI: 10.1109/ACCESS.2022.3223046. Online publication date: 2023.
    • (2023) BGCP-based traffic data imputation and accident detection applications for the national trunk highway. Accident Analysis & Prevention 186, 107051. DOI: 10.1016/j.aap.2023.107051. Online publication date: Jun-2023.
    • (2022) Detection of Road Accidents Using Synthetically Generated Multi-Perspective Accident Videos. IEEE Transactions on Intelligent Transportation Systems, 1-10. DOI: 10.1109/TITS.2022.3222769. Online publication date: 2022.
    • (2022) Identifying high crash risk segments in rural roads using ensemble decision tree-based models. Scientific Reports 12:1. DOI: 10.1038/s41598-022-24476-z. Online publication date: 21-Nov-2022.
    • (2020) Analysis of Factors Affecting the Severity of Automated Vehicle Crashes Using XGBoost Model Combining POI Data. Journal of Advanced Transportation 2020, 1-12. DOI: 10.1155/2020/8881545. Online publication date: 18-Nov-2020.
    • (2020) Prediction of Highway Tunnel Pavement Performance Based on Digital Twin and Multiple Time Series Stacking. Advances in Civil Engineering 2020:1. DOI: 10.1155/2020/8824135. Online publication date: 4-Dec-2020.
