Abstract
Analysis of crash injury severity is a promising research area in highway safety studies. A better understanding of crash severity risk factors is vital for the proactive implementation of suitable countermeasures. In the literature, crash injury severity has been widely studied using statistical models. Although these models have a sound theoretical basis and are interpretable, they rest on several restrictive assumptions which, if violated, may yield biased model estimates. To overcome these limitations, applied machine learning has rapidly emerged in highway safety analysis. This study models the injury severity of motor vehicle crashes using three machine learning approaches: a vanilla multi-layer perceptron (MLP) built with Keras, an MLP with embedding layers, and TabNet. Among the three models, TabNet may be considered a fairly complex framework, as it is built on an attention-based network for tabular data. To improve the predictive performance of the proposed models, hyperparameter tuning was carried out using Bayesian optimization. Several evaluation metrics (accuracy, precision, recall, F1 score, AUC, and training time) were used to compare the injury severity classification performance of the models. Experimental results showed that all the models yielded similar and adequate performance on most of the evaluation metrics. In terms of training time, however, the vanilla Keras MLP outperformed the other models with a training time of 3.45 s, which represents a reduction of 51% and 93% compared to the MLP with embedding layers and TabNet, respectively. Feature importance analysis conducted using TabNet revealed that predictors such as the number of vehicles involved, number of casualties, speed limit, junction location, vehicle type, and road type are the most sensitive variables aggravating injury severity.
The proposed supervised deep learning models, supported by feature importance analysis, make the modeling framework transparent and interpretable. The outcome of this study could provide essential guidance for practitioners in taking timely and concrete steps to improve highway safety. Moreover, this research could allow trauma and emergency centers to predict the likely damage from a traffic accident and deploy the necessary emergency units to provide appropriate emergency treatment.
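To make the comparison criteria concrete, the following is a minimal sketch of how the threshold-based metrics named above (accuracy, precision, recall, F1 score) and the relative training-time reduction can be computed. It is not the authors' code: the function names, the binary label encoding (1 = severe, 0 = non-severe), and the toy labels are assumptions made purely for illustration. The abstract reports only the 3.45 s figure and the two percentages, so the other training times printed below are rough back-calculations from those rounded percentages, not reported values.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels in {0, 1}.

    Assumption: 1 marks the positive (severe) class. The paper's actual
    class encoding is not stated in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def relative_reduction(baseline_s: float, new_s: float) -> float:
    """Percentage by which new_s is smaller than baseline_s."""
    return 100.0 * (baseline_s - new_s) / baseline_s


def implied_baseline(new_s: float, reduction_pct: float) -> float:
    """Baseline time implied by a reported time and a percentage reduction."""
    return new_s / (1.0 - reduction_pct / 100.0)


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))

    # Back-calculated from the rounded 51% and 93% figures; approximate only.
    print(round(implied_baseline(3.45, 51.0), 2))  # MLP with embedding layers
    print(round(implied_baseline(3.45, 93.0), 2))  # TabNet
```

AUC is left out of the sketch because it needs predicted probabilities rather than hard labels. For a severity variable with more than two classes, the same counts would be computed per class and then averaged.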







Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
The authors would like to acknowledge the financial support provided by the Deanship of Scientific Research at King Fahd University of Petroleum & Minerals (KFUPM) under Research Grant SB201021.
Funding
This research was funded by the Deanship of Research Oversight and Coordination at King Fahd University of Petroleum & Minerals (KFUPM) under Research Grant SB201021.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Sattar, K., Chikh Oughali, F., Assi, K. et al. Transparent deep machine learning framework for predicting traffic crash severity. Neural Comput & Applic 35, 1535–1547 (2023). https://doi.org/10.1007/s00521-022-07769-2
DOI: https://doi.org/10.1007/s00521-022-07769-2