Abstract
Traffic sign detection (TSD) is a key problem for smart vehicles. Traffic sign recognition (TSR) supplies useful information, including directions and alerts, to advanced driver assistance systems (ADAS) and Cooperative Intelligent Transport Systems (CITS). Detecting traffic signs in practical autonomous driving scenes with an approach that is both highly accurate and real-time remains difficult. This paper analyzes the object detection methods Yolo V4 and Yolo V4-tiny combined with Spatial Pyramid Pooling (SPP). It evaluates how much the SPP principle improves the ability of the Yolo V4 and Yolo V4-tiny backbone networks to extract and learn object features. Both models are measured and compared using key metrics, including mean average precision (mAP), working area size, detection time, and billions of floating-point operations (BFLOPS). Experiments show that Yolo V4_1 (with SPP) outperforms the state-of-the-art schemes, achieving 99.4% accuracy in our experiments, along with the best total BFLOPS (127.26) and mAP (99.32%). By comparison, in earlier studies the Yolo V3 SPP training process reached only 98.99% mAP with IoU 90.09. The training mAP rises by 0.44% with Yolo V4_1 (mAP 99.32%) in our experiment. Further, SPP improves the performance of all models in the experiment.
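The SPP principle mentioned in the abstract can be illustrated with a minimal sketch. It assumes the YOLO-style variant of SPP, in which the feature map is max-pooled at stride 1 with several kernel sizes and the results are concatenated with the input along the channel axis. The kernel sizes 5, 9, and 13 follow the publicly available Darknet configuration files and are an assumption here, since the abstract does not state them. The function names `max_pool_same` and `spp_block` are hypothetical, and plain nested lists stand in for tensors.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling over one 2D channel, keeping the spatial size.

    Positions outside the map are ignored, which is equivalent to
    padding with negative infinity.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2  # window radius for an odd kernel size k
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(channels, kernel_sizes=(5, 9, 13)):
    """Concatenate the input channels with their pooled versions.

    `channels` is a list of 2D lists (one per channel). The kernel sizes
    are an assumption taken from public Darknet configs, not from the paper.
    """
    out = list(channels)  # identity branch
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


# Illustrative input: a single 4x4 channel with values 0..15.
feature_map = [[[float(y * 4 + x) for x in range(4)] for y in range(4)]]
pooled = spp_block(feature_map)
print(len(pooled), len(pooled[0]), len(pooled[0][0]))
```

Because every branch keeps the spatial size, the output has four times as many channels as the input, each summarizing a different receptive field. That multi-scale context is the effect the abstract credits with improving feature extraction.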
Acknowledgments
This paper is supported by the Ministry of Science and Technology, Taiwan, under grant numbers MOST-107-2221-E-324-018-MY2 and MOST-109-2622-E-324-004. This research is also partially sponsored by Chaoyang University of Technology (CYUT) and the Higher Education Sprout Project, Ministry of Education (MOE), Taiwan, under the project name: “The R&D and the cultivation of talent for health-enhancement products.”
Ethics declarations
Conflict of interest
The authors declared that they have no conflicts of interest in this work.
About this article
Cite this article
Dewi, C., Chen, RC., Jiang, X. et al. Deep convolutional neural network for enhancing traffic sign recognition developed on Yolo V4. Multimed Tools Appl 81, 37821–37845 (2022). https://doi.org/10.1007/s11042-022-12962-5
DOI: https://doi.org/10.1007/s11042-022-12962-5