Abstract
Traffic accident detection is vital to road safety, affecting the lives of those involved and of others on the road. Detecting accidents with surveillance cameras mounted on traffic poles poses unique challenges, including incomplete dataset categories, small detection targets, and the need for lightweight models. Current traffic accident recognition algorithms, although effective, often demand extensive computational resources, making deployment on edge devices difficult. This paper proposes a more accurate and lightweight traffic accident recognition model based on YOLOv8, optimized for traffic pole monitoring and edge-device deployment. To improve small object detection, we modify the neck by adding a detection layer that exploits large-scale feature maps, paired with a dedicated small object detection head (SODL-SODH). We also design a lightweight cross-scale feature fusion module (LCSFFM) that optimizes the PAN-FPN structure, reducing model parameters and computational complexity while enhancing small-target detection. In the downsampling layer, we incorporate the squeeze-excited aggregate spatial attention module (SEASAM) into the C2F module, helping the network focus on essential image information with minimal impact on parameters and computation. To address dataset limitations, we build the traffic accident-type (TAT) dataset for training and evaluation and compare our model against other advanced methods. Experimental results show that our model outperforms the baseline on the TAT dataset, improving mAP0.5 by 1% while reducing parameters by 25.9%. On the BDD-IW dataset, our TP-YOLOv8s surpasses the other methods in accuracy, improving mAP0.5 by 1.4% over the best of them while using 84.1% fewer parameters.
Data Availability
The datasets used in the study are from the website and can be downloaded through the following links: BDD-IW dataset (https://github.com/ZhaoHe1023/Improved-YOLOv4), TAT dataset (https://github.com/Ningdashuai/TAT).
References
Xiao J (2019) SVM and KNN ensemble learning for traffic incident detection. Physica A Stat Mech Appl 517:29–35. https://doi.org/10.1016/j.physa.2018.10.060
Tang Y (2013) Deep learning using linear support vector machines, https://doi.org/10.48550/arXiv.1306.0239. arXiv preprint arXiv:1306.0239
Abeywickrama T, Cheema MA, Taniar D (2016) K-nearest neighbors on road networks: a journey in experimentation and in-memory implementation, https://doi.org/10.48550/arXiv.1601.01549. arXiv preprint arXiv:1601.01549
Kumeda B, Zhang F, Zhou F, Hussain S, Almasri A, Assefa M (2019) Classification of road traffic accident data using machine learning algorithms. In: 2019 IEEE 11th International Conference on Communication Software and Networks (ICCSN), IEEE, Chongqing, China, pp 682–687. https://doi.org/10.1109/ICCSN.2019.8905362
Chakraborty P, Sharma A, Hegde C (2018) Freeway traffic incident detection from cameras: a semi-supervised learning approach. In: Zhang W, Bayen AM, Medina JJS, Barth MJ (ed) 21st International Conference on Intelligent Transportation Systems, ITSC 2018, Maui, HI, USA, November 4-7, 2018, IEEE, Maui, HI, USA, pp 1840–1845. https://doi.org/10.1109/ITSC.2018.8569426
Redmon J, Farhadi A (2018) Yolov3: an incremental improvement, https://doi.org/10.48550/arXiv.1804.02767. arXiv preprint arXiv:1804.02767
Jocher G (2022) Yolov5 Release v7.0. Accessed: 2022. https://github.com/ultralytics/yolov5/tree/v7.0
Xia Z, Gong J, Long Y, Ren W, Wang J, Lan H (2022) Research on traffic accident detection based on vehicle perspective. In: 2022 4th International Conference on Robotics and Computer Vision (ICRCV), IEEE, Wuhan, China, pp 223–227. https://doi.org/10.1109/ICRCV55858.2022.9953179
Tan M, Le QV (2019) Efficientnet: rethinking model scaling for convolutional neural networks. In: International Conference on Machine Learning, PMLR, pp 6105–6114. arXiv preprint arXiv:1905.11946
Hu J, Shen L, Sun G (2018) Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 7132–7141
Gour D, Kanskar A (2019) Optimised yolo: algorithm for cpu to detect road traffic accident and alert system. Int J Eng Res Technol 8:160–163
Pillai MS, Chaudhary G, Khari M, Crespo RG (2021) Real-time image enhancement for an automatic automobile accident detection through cctv using deep learning. Soft Comput 25(18):11929–11940. https://doi.org/10.1007/S00500-021-05576-W
Lee C, Kim H, Oh S, Doo I (2021) A study on building a “real-time vehicle accident and road obstacle notification model’’ using ai cctv. Appl Sci 11(17):8210. https://doi.org/10.3390/app11178210
Ghahremannezhad H, Shi H, Liu C (2022) Real-time accident detection in traffic surveillance using deep learning. In: IEEE International Conference on Imaging Systems and Techniques, IST 2022, Kaohsiung, Taiwan, June 21-23, 2022, IEEE, Kaohsiung, Taiwan, pp 1–6. https://doi.org/10.1109/IST55454.2022.9827736
Ahmed MIB, Zaghdoud R, Ahmed MS, Sendi R, Alsharif S, Alabdulkarim J, Saad BAA, Alsabt R, Rahman A, Krishnasamy G (2023) A real-time computer vision based approach to detection and classification of traffic incidents. Big Data Cogn. Comput. 7(1):22. https://doi.org/10.3390/BDCC7010022
Li H, Xiong P, An J, Wang L (2018) Pyramid attention network for semantic segmentation. arXiv preprint arXiv:1805.10180
Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2117–2125
Jocher G (2023) Yolov8. Accessed: 2023. https://github.com/ultralytics/ultralytics/tree/main
Bochkovskiy A, Wang C-Y, Liao H-YM (2020) Yolov4: Optimal speed and accuracy of object detection, https://doi.org/10.48550/arXiv.2004.10934. arXiv preprint arXiv:2004.10934
Huang X, Wang X, Lv W, Bai X, Long X, Deng K, Dang Q, Han S, Liu Q, Hu X, et al (2021) Pp-yolov2: A practical object detector, https://doi.org/10.48550/arXiv.2104.10419. arXiv preprint arXiv:2104.10419
Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 7263–7271. https://doi.org/10.1109/CVPR.2017.690
Wang C-Y, Bochkovskiy A, Liao H-YM (2021) Scaled-yolov4: scaling cross stage partial network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 13029–13038. https://doi.org/10.48550/arXiv.2011.08036
Wang C-Y, Bochkovskiy A, Liao H-YM (2023) Yolov7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 7464–7475. https://doi.org/10.48550/arXiv.2207.02696
Ge Z (2021) Yolox: Exceeding yolo series in 2021, https://doi.org/10.48550/arXiv.2107.08430. arXiv preprint arXiv:2107.08430
Li C, Li L, Geng Y, Jiang H, Cheng M, Zhang B, Ke Z, Xu X, Chu X (2023) Yolov6 v3.0: a full-scale reloading, https://doi.org/10.48550/arXiv.2301.05586. arXiv preprint arXiv:2301.05586
Xu S, Wang X, Lv W, Chang Q, Cui C, Deng K, Wang G, Dang Q, Wei S, Du Y, et al (2022) Pp-yoloe: an evolved version of yolo, https://doi.org/10.48550/arXiv.2203.16250. arXiv preprint arXiv:2203.16250
Elfwing S, Uchibe E, Doya K (2018) Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw 107:3–11. https://doi.org/10.48550/arXiv.1702.03118
Agarap A (2018) Deep learning using rectified linear units (relu), https://doi.org/10.48550/arXiv.1803.08375. arXiv preprint arXiv:1803.08375
Wang C-Y, Liao H-YM, Wu Y-H, Chen P-Y, Hsieh J-W, Yeh I-H (2020) Cspnet: a new backbone that can enhance learning capability of cnn. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp 390–391. https://doi.org/10.48550/arXiv.1911.11929
Mao A, Mohri M, Zhong Y (2023) Cross-entropy loss functions: theoretical analysis and applications. In: International Conference on Machine Learning, PMLR, pp 23803–23828. https://doi.org/10.48550/arXiv.2304.07288.
Li Q, Jia X, Zhou J, Shen L, Duan J (2024) Rediscovering bce loss for uniform classification, https://doi.org/10.48550/arXiv.2403.07289. arXiv preprint arXiv:2403.07289
Ren J, Zhang M, Yu C, Liu Z (2022) Balanced mse for imbalanced visual regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 7926–7935. https://doi.org/10.48550/arXiv.2203.16427
Li X, Wang W, Wu L, Chen S, Hu X, Li J, Tang J, Yang J (2020) Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection. Adv Neural Inform Process Syst 33:21002–21012. https://doi.org/10.48550/arXiv.2006.04388
Yu J, Jiang Y, Wang Z, Cao Z, Huang T (2016) Unitbox: An advanced object detection network. In: Proceedings of the 24th ACM International Conference on Multimedia, pp 516–520. https://doi.org/10.1145/2964284.2967274
Rezatofighi H, Tsoi N, Gwak J, Sadeghian A, Reid I, Savarese S (2019) Generalized intersection over union: a metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 658–666. https://doi.org/10.1109/CVPR.2019.00075
Zheng Z, Wang P, Liu W, Li J, Ye R, Ren D (2020) Distance-iou loss: faster and better learning for bounding box regression. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 34, pp 12993–13000. https://doi.org/10.48550/arXiv.1911.08287
Yu F, Chen H, Wang X, Xian W, Chen Y, Liu F, Madhavan V, Darrell T (2020) Bdd100k: a diverse driving dataset for heterogeneous multitask learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 2636–2645. https://doi.org/10.48550/arXiv.1805.04687
Wang R, Zhao H, Xu Z, Ding Y, Li G, Zhang Y, Li H (2023) Real-time vehicle target detection in inclement weather conditions based on yolov4. Front Neurorobotics 17:1058723. https://doi.org/10.3389/fnbot.2023.1058723
Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: Common objects in context. In: European Conference on Computer Vision (ECCV). https://cocodataset.org
UCSD LISA Lab (2010) LISA Traffic Sign Dataset. University of California, San Diego. http://cvrr.ucsd.edu
Zhao Y, Lv W, Xu S, Wei J, Wang G, Dang Q, Liu Y, Chen J (2024) Detrs beat yolos on real-time object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 16965–16974. https://doi.org/10.48550/arXiv.2304.08069
Lin J, Mao X, Chen Y, Xu L, He Y, Xue H (2022) D 2etr: Decoder-only detr with computationally efficient cross-scale attention, https://doi.org/10.48550/arXiv.2203.00860. arXiv preprint arXiv:2203.00860
Soudy M, Afify Y, Badr N (2022) Repconv: a novel architecture for image scene classification on intel scenes dataset. Int J Intell Comput Inform Sci 22(2):63–73. https://doi.org/10.21608/ijicis.2022.118834.1163
Itti L, Koch C, Niebur E (1998) A model of saliency-based visual attention for rapid scene analysis. IEEE Trans pattern Anal Mach Intell 20(11):1254–1259. https://doi.org/10.1109/34.730558
Rensink RA (2000) The dynamic representation of scenes. Vis Cognit 7(1–3):17–42. https://doi.org/10.1080/135062800394667
Corbetta M, Shulman GL (2002) Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci 3(3):201–215. https://doi.org/10.1038/nrn755
Larochelle H, Hinton GE (2010) Learning to combine foveal glimpses with a third-order boltzmann machine. Adv Neural Inform Process Syst 23
Narayanan M (2023) Senetv2: Aggregated dense layer for channelwise and global representations, https://doi.org/10.48550/arXiv.2311.10807. arXiv preprint arXiv:2311.10807
Woo S, Park J, Lee J-Y, Kweon IS (2018) Cbam: convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19. https://doi.org/10.48550/arXiv.1807.06521
Chen L, Zhang H, Xiao J, Nie L, Shao J, Liu W, Chua T-S (2017) Sca-cnn: Spatial and channel-wise attention in convolutional networks for image captioning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 5659–5667. https://doi.org/10.48550/arXiv.1611.05594
Zagoruyko S, Komodakis N (2016) Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer, https://doi.org/10.48550/arXiv.1612.03928. arXiv preprint arXiv:1612.03928
Shah AP, Lamare J-B, Nguyen-Anh T, Hauptmann A (2018) Cadp: a novel dataset for cctv traffic camera based accident analysis. In: 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), pp 1–9. https://doi.org/10.48550/arXiv.1809.05782. IEEE
Xu Y, Huang C, Nan Y, Lian S (2022) Tad: a large-scale benchmark for traffic accidents detection from video surveillance, https://doi.org/10.48550/arXiv.2209.12386. arXiv preprint arXiv:2209.12386
Ren S, He K, Girshick R, Sun J (2016) Faster r-cnn: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149. https://doi.org/10.48550/arXiv.1506.01497
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) Ssd: single shot multibox detector. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14, pp 21–37. https://doi.org/10.48550/arXiv.1512.02325 . Springer
Zhu X, Lyu S, Wang X, Zhao Q (2021) Tph-yolov5: Improved yolov5 based on transformer prediction head for object detection on drone-captured scenarios. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 2778–2788
Funding
This research was funded by the Key R&D projects of Xinjiang Uygur Autonomous Region, grant number 2022B01006.
Author information
Contributions
Z.N. and T.Z. conceptualized the study; Z.N. curated and processed the data; Z.N. and T.Z. conducted formal analysis; G.S. acquired funding, provided resources, participated in the investigation, and supervised the project; Z.N. designed the methodology; A.W. and X.L. managed the project’s progress and coordination; Z.N. and X.L. validated the results and created visualizations; Z.N. wrote the main manuscript text; A.W. and T.Z. reviewed and edited the manuscript.
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ning, Z., Zhang, T., Li, X. et al. Tp-yolov8: a lightweight and accurate model for traffic accident recognition. J Supercomput 81, 622 (2025). https://doi.org/10.1007/s11227-025-07129-6