Abstract
Traffic accidents are a leading cause of death and injury in many developed nations, and any road user can be involved in an accident at any moment. The type of collision also plays a role in determining who is accountable for an accident. Classifying collisions in road accidents can therefore pave the way for safer roads and reduced accident rates. This article proposes a novel approach for classifying the type of collision that may occur between vehicles and nearby pedestrians, obstacles, etc. on roads. Six hybrid classifiers are introduced: "XGBoost classifier using ISSA", "XGBoost classifier using ESSA", "XGBoost classifier using TVBSSA", "CatBoost classifier using ISSA", "CatBoost classifier using ESSA", and "CatBoost classifier using TVBSSA". The classifiers are evaluated on the SWITRS dataset, with a total of 103,000 accidents considered when determining the "Type_of_Collision". Classification is performed with the XGBoost and CatBoost algorithms, while three nature-inspired algorithms (NIAs) are used at the feature selection stage: the Improved Salp Swarm Algorithm (ISSA), the Enhanced Salp Swarm Algorithm (ESSA), and the Time-Varying Binary Salp Swarm Algorithm (TVBSSA). The XGBoost classifier using ISSA shows good stability with fewer hyper-parameters and achieves the highest accuracy under different levels of training data volume. Its Accuracy, Mean Square Error, and ROC-AUC are 90.40, 0.1624, and 97.75, respectively. Moreover, the confusion matrix and evaluation metrics of the XGBoost classifier using ISSA were better than those of the other two approaches. These findings are helpful for classifying the type of collision and are highly significant for smart city projects that aim to establish timely, proactive strategies and improve road traffic safety.
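The wrapper-style feature selection described in the abstract can be illustrated with a minimal sketch of a basic binary Salp Swarm Algorithm. This is not the ISSA, ESSA, or TVBSSA variant used in the article, whose details are not given here. It assumes the plain leader/follower update of the original Salp Swarm Algorithm, a single leader, a sigmoid (S-shaped) transfer function to turn positions into 0/1 feature masks, and position bounds of [-6, 6]. The function `binary_ssa`, the toy objective `toy_fitness`, and the set `informative` are hypothetical names introduced only for illustration. In a real pipeline the fitness would presumably be the validation error of an XGBoost or CatBoost model trained on the selected columns.

```python
import math
import random


def binary_ssa(fitness, n_features, n_salps=20, n_iter=50, seed=0):
    """Minimise fitness(mask) over 0/1 feature masks with a basic binary SSA."""
    rng = random.Random(seed)
    lb, ub = -6.0, 6.0  # assumed bounds on the continuous salp positions

    def to_mask(position):
        # S-shaped transfer: sigmoid(x) is the probability of keeping a feature.
        return [1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0
                for x in position]

    salps = [[rng.uniform(lb, ub) for _ in range(n_features)]
             for _ in range(n_salps)]
    food_pos, food_mask, food_fit = None, None, math.inf

    for it in range(1, n_iter + 1):
        # Evaluate every salp; the best mask seen so far is the food source.
        for pos in salps:
            mask = to_mask(pos)
            fit = fitness(mask)
            if fit < food_fit:
                food_pos, food_mask, food_fit = pos[:], mask, fit

        # c1 decays over the iterations, shifting from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iter) ** 2))

        # The leader (first salp) moves around the food source.
        leader = salps[0]
        for j in range(n_features):
            step = c1 * ((ub - lb) * rng.random() + lb)
            x = food_pos[j] + step if rng.random() >= 0.5 else food_pos[j] - step
            leader[j] = min(ub, max(lb, x))

        # Each follower moves to the midpoint between itself and the salp ahead.
        for i in range(1, n_salps):
            salps[i] = [(a + b) / 2.0 for a, b in zip(salps[i], salps[i - 1])]

    return food_mask, food_fit


# Hypothetical stand-in objective: features 0, 2 and 5 are "useful";
# missing one costs 1.0 and every unnecessary feature costs 0.1.
informative = {0, 2, 5}


def toy_fitness(mask):
    chosen = {i for i, bit in enumerate(mask) if bit}
    return len(informative - chosen) + 0.1 * len(chosen - informative)


if __name__ == "__main__":
    best_mask, best_fit = binary_ssa(toy_fitness, n_features=8, seed=1)
    print("selected features:", [i for i, b in enumerate(best_mask) if b])
    print("fitness:", best_fit)
```

Replacing `toy_fitness` with a function that trains a classifier on the masked columns and returns its validation error would give the general shape of the hybrid classifiers named in the abstract, although the exact objective used by the authors is not stated here.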







Availability of data and material
Not Applicable
Code Availability
The Python code is available from the corresponding author upon request.
Funding
Not Applicable
Author information
Contributions
The first author, Dr. Insha Altaf, carried out the implementation and wrote the article. The second author, Dr. Ajay Kaul, contributed to the idea and framing of the article.
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest or competing interests.
About this article
Cite this article
Altaf, I., Kaul, A. Classifying collisions in road accidents using XGBOOST, CATBOOST and SALP SWARM based optimization algorithms. Multimed Tools Appl 83, 38387–38410 (2024). https://doi.org/10.1007/s11042-023-16969-4
DOI: https://doi.org/10.1007/s11042-023-16969-4