Abstract
Intelligent Transportation Systems (ITS) aim to strengthen the connection between vehicles, roads, and people. Traffic signs carry essential road information in ITS, so their automatic detection has become an important function of intelligent vehicles. This paper proposes a lightweight, vehicle-mounted, multi-scale traffic sign detector. First, guided by an attention fusion algorithm, an improved feature pyramid network named AFPN is proposed. It assigns weights according to the importance of information and fuses multi-dimensional attention maps to improve feature extraction and information retention. Second, a multi-head detection structure is designed to improve the localization and detection capability of the detector: a dedicated detection head is constructed for each target scale to improve detection accuracy. Experimental results show that, compared with other state-of-the-art methods, the proposed method achieves high detection accuracy (50.3% for small targets and 64.8% for large targets) and a better trade-off between detection speed and detection accuracy. Furthermore, the proposed detector is deployed on a Jetson Xavier NX and integrated with a vehicle-mounted camera, inverter, and LCD to realize real-time traffic sign detection on the vehicle terminal at a speed of 25.6 FPS.
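The two ideas named in the abstract, importance-weighted fusion of feature maps and scale-specific detection heads, can be illustrated with a minimal sketch. This is not the authors' AFPN or head design, which the abstract does not specify. The function names (`softmax`, `fuse_feature_maps`, `pick_head`), the use of softmax to normalize importance scores, and the 32 and 96 pixel size cut-offs (borrowed from the COCO small/medium/large convention) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw importance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_feature_maps(feature_maps, importance_scores):
    """Weighted sum of equally sized 2D feature maps.

    Hypothetical helper: it shows generic importance-weighted fusion only,
    not the attention fusion algorithm used in the paper.
    """
    if len(feature_maps) != len(importance_scores):
        raise ValueError("need one importance score per feature map")
    weights = softmax(importance_scores)
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for fmap, weight in zip(feature_maps, weights):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weight * fmap[r][c]
    return fused


def pick_head(box_width, box_height, small_max=32, medium_max=96):
    """Route a target to a detection head by its size in pixels.

    The thresholds are assumed (COCO-style), not taken from the paper.
    """
    side = math.sqrt(box_width * box_height)  # side of an equal-area square
    if side < small_max:
        return "small"
    if side < medium_max:
        return "medium"
    return "large"


if __name__ == "__main__":
    maps = [[[1.0, 2.0]], [[3.0, 4.0]]]
    # Equal scores give equal weights, so the result is the plain average.
    print(fuse_feature_maps(maps, [0.0, 0.0]))
    print(pick_head(16, 16), pick_head(64, 64), pick_head(128, 128))
```

In a real detector the importance scores would be produced by learned attention modules and the fusion would run on tensors rather than nested lists; the sketch only makes the weighting and routing steps concrete.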












Data availability
The data that support the findings of this study are available from the corresponding author, Zhekang Dong, upon reasonable request.
Funding
This work was jointly supported by the Key R&D Project of Hangzhou under Grants No. 2022AIZD0009 and 2022AIZD0022, the Key Research and Development Program of Zhejiang Province under Grant No. 2022C01062, and the National Natural Science Foundation of China under Grant No. 62001416.
Author information
Contributions
Conceptualization, JW and YC; methodology, JW; software, YC; validation, YG and YY; formal analysis, JW and YG; investigation, YY; resources, MG; data curation, JW and YC; writing-original draft preparation, JW; writing-review and editing, YY and ZD; visualization, YC; supervision, ZD; project administration, ZD and MG; funding acquisition, ZD. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical Approval
This article does not contain any studies involving animals performed by any of the authors. This article does not contain any studies involving human participants performed by any of the authors.
About this article
Cite this article
Wang, J., Chen, Y., Gu, Y. et al. A lightweight vehicle mounted multi-scale traffic sign detector using attention fusion pyramid. J Supercomput 80, 3360–3381 (2024). https://doi.org/10.1007/s11227-023-05594-5
DOI: https://doi.org/10.1007/s11227-023-05594-5