A lightweight vehicle mounted multi-scale traffic sign detector using attention fusion pyramid

The Journal of Supercomputing

Abstract

Intelligent Transportation Systems (ITS) aim to strengthen the connection between vehicles, roads, and people. Because traffic signs carry important road information in ITS, their automatic detection has become an important component of intelligent vehicles. In this paper, a lightweight vehicle-mounted multi-scale traffic sign detector is proposed. First, guided by an attention fusion algorithm, an improved feature pyramid network, named AFPN, is proposed. It assigns weights according to the importance of information and fuses multi-dimensional attention maps to improve feature extraction and information retention. Second, a multi-head detection structure is designed to improve the localization and detection capability of the detector: the corresponding detection head is constructed according to the target scale to improve detection accuracy. Experimental results show that, compared with other state-of-the-art methods, the proposed method not only achieves excellent detection accuracy, with 50.3% for small targets and 64.8% for large targets, but also strikes a better trade-off between detection speed and detection accuracy. Furthermore, the proposed detector is deployed on the Jetson Xavier NX and integrated with a vehicle-mounted camera, inverter, and LCD to realize real-time traffic sign detection on the vehicle terminal, reaching a speed of 25.6 FPS.
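The idea of assigning importance weights and fusing several attention maps can be illustrated with a minimal sketch. This is not the AFPN design itself, which the abstract does not specify. The function names, the softmax normalization of the importance scores, and the use of small 2D maps are all assumptions made only for illustration.

```python
import math


def softmax(scores):
    """Turn raw importance scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention_maps(attention_maps, importance_scores):
    """Weighted element-wise sum of same-shaped 2D attention maps.

    Hypothetical helper: the weights come from a softmax over the
    importance scores, which is an assumption, not the paper's rule.
    """
    weights = softmax(importance_scores)
    rows, cols = len(attention_maps[0]), len(attention_maps[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for weight, amap in zip(weights, attention_maps):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weight * amap[r][c]
    return fused


def apply_attention(feature, fused_map):
    """Re-weight a 2D feature map element-wise by the fused attention map."""
    return [
        [f * a for f, a in zip(feature_row, attention_row)]
        for feature_row, attention_row in zip(feature, fused_map)
    ]


if __name__ == "__main__":
    # Two toy attention maps with equal importance scores.
    maps = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 1.0], [1.0, 0.0]],
    ]
    fused = fuse_attention_maps(maps, [0.0, 0.0])
    print(fused)
    print(apply_attention([[2.0, 4.0], [6.0, 8.0]], fused))
```

With equal importance scores the fused map is the plain average of the inputs; raising one score shifts the fused map toward the corresponding attention map.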

Data availability

The data that support the findings of this study are available from the corresponding author, Zhekang Dong, upon reasonable request.

Funding

This work was jointly supported by the Key R&D Project of Hangzhou under Grants No. 2022AIZD0009 and 2022AIZD0022, the Key Research and Development Program of Zhejiang Province under Grant No. 2022C01062, and the National Natural Science Foundation of China under Grant No. 62001416.

Author information

Contributions

Conceptualization, JW and YC; methodology, JW; software, YC; validation, YG and YY; formal analysis, JW and YG; investigation, YY; resources, MG; data curation, JW and YC; writing-original draft preparation, JW; writing-review and editing, YY and ZD; visualization, YC; supervision, ZD; project administration, ZD and MG; funding acquisition, ZD. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Zhekang Dong.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical Approval

This article does not contain any studies involving animals performed by any of the authors. This article does not contain any studies involving human participants performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 32350 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, J., Chen, Y., Gu, Y. et al. A lightweight vehicle mounted multi-scale traffic sign detector using attention fusion pyramid. J Supercomput 80, 3360–3381 (2024). https://doi.org/10.1007/s11227-023-05594-5

  • DOI: https://doi.org/10.1007/s11227-023-05594-5
