Abstract
Convolutional neural networks have become a commonly used deep learning approach to traffic sign recognition in recent years. However, traffic signs appear in images at very different sizes. Because a small object occupies only a small area of the input image, fewer features can be extracted from it, and it is harder to detect than medium and large objects. Achieving high-speed, high-accuracy detection of objects of all sizes at the same time therefore remains challenging. In this paper, we propose two models for detecting traffic signs, CDFF and CDFF-s. The models contain the following four components: (1) in the backbone, we apply an improved activation function, FMish, to increase training stability; (2) after the backbone, we apply the DFb-SPP module to fuse context and semantic information; (3) in the neck, we use the DFb module for feature fusion, which also reduces the number of parameters; and (4) in the head, we propose a loss function, SCIoU, which is optimized for small objects and makes the model converge faster. Experimental results on the general traffic sign datasets TT100K and LISA show that both proposed models detect small objects accurately without losing detection accuracy on medium and large objects. Strong results are also obtained on the remote sensing dataset RSOD, which has a similar object size distribution. Moreover, detection is faster than YOLOv4, so the models can meet the accuracy and real-time requirements of autonomous driving and driver assistance systems.













References
Akatsuka H, Imai S (1987) Road signposts recognition system. No. 870239. SAE Technical Paper
Bi Z, Yu L, Gao H, Zhou P, Yao H (2020) Improved VGG model-based efficient traffic sign recognition for safe driving in 5G scenarios. Int J Mach Learn Cybernet 1–12
Gudigar A, Chokkadi S, Raghavendra U (2016) A review on automatic detection and recognition of traffic sign. Multimedia Tools Appl 75(1):333–364
Ojala T, Pietikainen M, Harwood D (1994) Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In: Proceedings of 12th international conference on pattern recognition, vol 1. IEEE, pp 582–585
Grigorescu SE, Petkov N, Kruizinga P (2002) Comparison of texture features based on Gabor filters. IEEE Trans Image Process 11(10):1160–1167
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05), vol 1. IEEE, pp 886–893
Stallkamp J, Schlipsing M, Salmen J, Igel C (2011) The German traffic sign recognition benchmark: a multi-class classification competition. In: The 2011 international joint conference on neural networks. IEEE, pp 1453–1460
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587
Ren S, He K, Girshick R, Sun J (2016) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788
Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7263–7271
Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767
Bochkovskiy A, Wang CY, Liao HYM (2020) Yolov4: optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934
Cireşan D, Meier U, Masci J, Schmidhuber J (2011) A committee of neural networks for traffic sign classification. In: The 2011 international joint conference on neural networks. IEEE, pp 1918–1921
Cireşan D, Meier U, Masci J, Schmidhuber J (2012) Multi-column deep neural network for traffic sign classification. Neural Netw 32:333–338
Cai Z, Fan Q, Feris RS, Vasconcelos N (2016) A unified multi-scale deep convolutional neural network for fast object detection. In: European conference on computer vision. Springer, Cham, pp 354–370
Chen J, Jia K, Chen W, Lv Z, Zhang R (2021) A real-time and high-precision method for small traffic-signs recognition. Neural Comput Appl 1–13
Poudel RP, Liwicki S, Cipolla R (2019) Fast-scnn: Fast semantic segmentation network. arXiv preprint arXiv:1902.04502
Zhao Q, Sheng T, Wang Y, Tang Z, Chen Y, Cai L, Ling H (2019) M2det: A single-shot object detector based on multi-level feature pyramid network. In: Proceedings of the AAAI conference on artificial intelligence, vol 33, no 01, pp 9259–9266
Adarsh P, Rathi P, Kumar M (2020) YOLO v3-Tiny: Object Detection and Recognition using one stage improved model. In: 2020 6th international conference on advanced computing and communication systems (ICACCS). IEEE, pp 687–694
Jiang Z, Zhao L, Li S, Jia Y (2020) Real-time object detection method based on improved YOLOv4-tiny. arXiv preprint arXiv:2011.04244
Yan B, Fan P, Lei X, Liu Z, Yang F (2021) A real-time apple targets detection method for picking robot based on improved YOLOv5. Remote Sens 13(9):1619
Møgelmose A, Trivedi MM, Moeslund TB (2012) Vision-based traffic sign detection and analysis for intelligent driver assistance systems: perspectives and survey. IEEE Trans Intell Transp Syst 13(4):1484–1497
Long Y, Gong Y, Xiao Z, Liu Q (2017) Accurate object localization in remote sensing images based on convolutional neural networks. IEEE Trans Geosci Remote Sens 55(5):2486–2498
Soo S (2014) Object detection using Haar-cascade Classifier. Institute of Computer Science, University of Tartu 2(3):1–12
Padilla R, Costa Filho CFF, Costa MGF (2012) Evaluation of haar cascade classifiers designed for face detection. World Acad Sci Eng Technol 64:362–365
Setjo CH, Achmad B (2017) Thermal image human detection using Haar-cascade classifier. In: 2017 7th international annual engineering seminar (InAES). IEEE, pp 1–6
Zhu Q, Yeh MC, Cheng KT, Avidan S (2006) Fast human detection using a cascade of histograms of oriented gradients. In: 2006 IEEE computer society conference on computer vision and pattern recognition (CVPR'06), vol 2. IEEE, pp 1491–1498
Suykens JA, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300
Suthaharan S (2016) Support vector machine. In: Machine learning models and algorithms for big data classification. Springer, Boston, pp 207–235
Chen S, Wang W, Van Zuylen H (2009) Construct support vector machine ensemble to detect traffic incident. Expert Syst Appl 36(8):10976–10986
Felzenszwalb P, McAllester D, Ramanan D (2008) A discriminatively trained, multiscale, deformable part model. In: 2008 IEEE conference on computer vision and pattern recognition. IEEE, pp 1–8
Yan J, Lei Z, Wen L, Li SZ (2014) The fastest deformable part model for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2497–2504
LeCun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:1097–1105
Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: Towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 28:91–99
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: Single shot multibox detector. In: European conference on computer vision. Springer, Cham, pp 21–37
Fu CY, Liu W, Ranga A, Tyagi A, Berg AC (2017) Dssd: Deconvolutional single shot detector. arXiv preprint arXiv:1701.06659
Li Z, Zhou F (2017) FSSD: feature fusion single shot multibox detector. arXiv preprint arXiv:1712.00960
Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE international conference on computer vision, pp 2980–2988
Jocher G, Stoken A, Borovec J, Changyu L, Hogan A (2020) ultralytics/yolov5: v3.1 - Bug Fixes and Performance Improvements. https://doi.org/10.5281/zenodo.3908559
Zeng Y, Lan J, Ran B, Wang Q, Gao J (2015) Restoration of motion-blurred image based on border deformation detection: a traffic sign restoration model. PLoS ONE 10(4):e0120885
Bahlmann C, Zhu Y, Ramesh V, Pellkofer M, Koehler T (2005) A system for traffic sign detection, tracking, and recognition using color, shape, and motion information. In: IEEE Proceedings. intelligent vehicles symposium, 2005. IEEE, pp 255–260
Fleyeh H (2004) Color detection and segmentation for road and traffic signs. In: IEEE conference on cybernetics and intelligent systems, vol 2. IEEE, pp 809–814
Won WJ, Lee M, Son JW (2008) Implementation of road traffic signs detection based on saliency map model. In: 2008 IEEE intelligent vehicles symposium. IEEE, pp 542–547
John V, Yoneda K, Liu Z, Mita S (2015) Saliency map generation by the convolutional neural network for real-time traffic light detection using template matching. IEEE Trans Comput Imaging 1(3):159–173
Abukhait J, Abdel-Qader I, Oh JS, Abudayyeh O (2012) Road sign detection and shape recognition invariant to sign defects. In: 2012 IEEE international conference on electro/information technology. IEEE, pp 1–6
Chourasia JN, Bajaj P (2010) Centroid based detection algorithm for hybrid traffic sign recognition system. In: 2010 3rd international conference on emerging trends in engineering and technology. IEEE, pp 96–100
Froba B, Ernst A (2004) Face detection with the modified census transform. In: Sixth IEEE international conference on automatic face and gesture recognition, 2004. Proceedings. IEEE, pp 91–96
Møgelmose A, Liu D, Trivedi MM (2015) Detection of US traffic signs. IEEE Trans Intell Transp Syst 16(6):3116–3125
Karthikeyan D, Enitha C, Bharathi S, Durkadevi K (2020) Traffic sign detection and recognition using image processing. Int J Eng Res Technol (IJERT) NCICCT—2020 8(08)
Sabour S, Frosst N, Hinton GE (2017) Dynamic routing between capsules. arXiv preprint arXiv:1710.09829
Kumar AD (2018) Novel deep learning model for traffic sign detection using capsule networks. arXiv preprint arXiv:1805.04424
Alghmgham DA, Latif G, Alghazo J, Alzubaidi L (2019) Autonomous traffic sign (ATSR) detection and recognition using deep CNN. Procedia Comput Sci 163:266–274
Chen G, Wang H, Chen K, Li Z, Song Z, Liu Y, Chen W, Knoll A (2020) A survey of the four pillars for small object detection: Multiscale representation, contextual information, super-resolution, and region proposal. IEEE Trans Syst Man Cybernet Syst
Liu Z, Li D, Ge SS, Tian F (2020) Small traffic sign detection from large image. Appl Intell 50(1):1–13
Zhang R, Yin D, Ding J, Luo Y, Liu W, Yuan M, Zhu C, Zhou Z (2019) A detection method for low-pixel ratio object. Multimedia Tools Appl 78(9):11655–11674
Leng J, Liu Y, Du D, Zhang T, Quan P (2019) Robust obstacle detection and recognition for driver assistance systems. IEEE Trans Intell Transp Syst 21(4):1560–1571
Fang P, Shi Y (2018) Small object detection using context information fusion in faster R-CNN. In: 2018 IEEE 4th international conference on computer and communications (ICCC). IEEE, pp 1537–1540
Lim JS, Astrid M, Yoon HJ, Lee SI (2021) Small object detection using context and attention. In: 2021 international conference on artificial intelligence in information and communication (ICAIIC). IEEE, pp 181–186
Pang Y, Cao J, Wang J, Han J (2019) JCS-Net: Joint classification and super-resolution network for small-scale pedestrian detection in surveillance images. IEEE Trans Inf Forensics Secur 14(12):3322–3331
Yang Z, Chai X, Wang R, Guo W, Wang W, Pu L, Chen X (2019) Prior knowledge guided small object detection on high-resolution images. In: 2019 IEEE international conference on image processing (ICIP). IEEE, pp 86–90
Wilms C, Frintrop S (2018) AttentionMask: Attentive, efficient object proposal generation focusing on small objects. In: Asian conference on computer vision. Springer, Cham, pp 678–694
Wang R, Jiao L, Xie C, Chen P, Du J, Li R (2021) S-RPN: Sampling-balanced region proposal network for small crop pest detection. Comput Electron Agric 187:106290
Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: ICML
Ramachandran P, Zoph B, Le QV (2017) Swish: a self-gated activation function. arXiv preprint arXiv:1710.05941
Misra D (2019) Mish: a self regularized non-monotonic neural activation function. arXiv preprint arXiv:1908.08681
Zheng Z, Wang P, Liu W, Li J, Ye R, Ren D (2020) Distance-IoU loss: Faster and better learning for bounding box regression. In: Proceedings of the AAAI conference on artificial intelligence, vol 34, no 07, pp 12993–13000
Iandola F, Moskewicz M, Karayev S, Girshick R, Darrell T, Keutzer K (2014) Densenet: Implementing efficient convnet descriptor pyramids. arXiv preprint arXiv:1404.1869
Veit A, Wilber MJ, Belongie S (2016) Residual networks behave like ensembles of relatively shallow networks. Adv Neural Inf Process Syst 29:550–558
Ge Z, Liu S, Wang F, Li Z, Sun J (2021) Yolox: Exceeding yolo series in 2021. arXiv preprint arXiv:2107.08430
Acknowledgements
This work was supported by the National Natural Science Foundation of China (grant nos. 61972239 and 62071122) and the Key Research and Development Program Projects of Shaanxi Province (grant nos. 2020GY-024 and 2021GY-182). The authors would like to thank the anonymous reviewers and the associate editor for their valuable comments and suggestions, which improved the clarity of this manuscript.
Ethics declarations
Conflict of interest
The authors declare that there is no conflict of interest regarding the publication of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
1.1 Evaluation metrics
To quantitatively evaluate the proposed approach, we follow previous work [10, 11, 12, 13, 17, 46]. The evaluation metrics include the AP (Average Precision) series, the AR (Average Recall) series, and mAP (mean Average Precision). With Precision on the vertical axis and Recall on the horizontal axis, AP is the area enclosed by the PR (Precision-Recall) curve and the coordinate axes. mAP is the mean of the AP values over all categories; because it reflects both Precision and Recall, it is the main evaluation metric of the model.
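The description of AP as the area under the PR curve, and of mAP as the mean of the per-category AP values, can be illustrated with a minimal Python sketch. The function names `average_precision` and `mean_average_precision` and the sample curve are hypothetical, and the plain trapezoidal rule is an assumption made for illustration. Benchmark toolkits typically interpolate the curve before integrating, so this is not the evaluation code used in the paper.

```python
def average_precision(recalls, precisions):
    """Approximate the area under a PR curve with the trapezoidal rule.

    Assumption: `recalls` is sorted in increasing order and `precisions`
    holds the precision measured at each of those recall values.
    """
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have the same length")
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        mean_height = (precisions[i] + precisions[i - 1]) / 2.0
        area += width * mean_height
    return area


def mean_average_precision(ap_per_category):
    """mAP: the arithmetic mean of the per-category AP values."""
    return sum(ap_per_category) / len(ap_per_category)


# Hypothetical PR curve for one category.
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
print(ap)  # 0.875

# Hypothetical AP values for two categories.
print(mean_average_precision([0.875, 0.625]))  # 0.75
```

In this sketch a detector that keeps precision at 1.0 up to a recall of 0.5 and then drops linearly to 0.5 scores an AP of 0.875, which shows how a loss of precision at high recall lowers the area.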
The main evaluation metrics are defined as follows:
- FPS: Frames Per Second, the number of image frames that can be processed in one second.
- Detection results:
  TP = True positive
  TN = True negative
  FP = False positive
  FN = False negative
- P: \(\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}\)
- R: \(\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}\)
- AP: \(\text{AP} = \int_{0}^{1} p(r)\,\text{d}r\)
- mAP: \(\text{mAP} = \frac{\sum_{i=1}^{K} \text{AP}_{i}}{K}\), where \(K\) is the number of categories.
- AR: \(\text{AR} = 2\int_{0.5}^{1} \text{recall}(o)\,\text{d}o\)
- APS and ARS: AP and AR for small objects, with area smaller than \(32^{2}\);
- APM and ARM: AP and AR for medium objects, with area between \(32^{2}\) and \(96^{2}\);
- APL and ARL: AP and AR for large objects, with area larger than \(96^{2}\).
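As a rough illustration of the definitions above, the following Python sketch computes Precision and Recall from detection counts, assigns an object to the small, medium, or large group by its area, and approximates AR. All function names and sample numbers are hypothetical. Three points are assumptions rather than statements from the paper: \(o\) is taken to be the IoU threshold, as in the common definition of AR; areas are measured in pixels; and objects whose area is exactly \(32^{2}\) or \(96^{2}\) are counted as medium.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP); returns 0.0 when there are no detections."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def recall(tp, fn):
    """Recall = TP / (TP + FN); returns 0.0 when there are no ground-truth objects."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


def size_group(width, height):
    """Assign a box to the small / medium / large group by its area.

    Assumption: the boundary areas 32^2 and 96^2 fall into the medium group.
    """
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area <= 96 ** 2:
        return "medium"
    return "large"


def average_recall(thresholds, recalls):
    """AR = 2 * integral of recall(o) for o in [0.5, 1], via the trapezoidal rule.

    Assumption: `thresholds` are increasing IoU thresholds from 0.5 to 1.0.
    The factor 2 normalises by the interval length of 0.5.
    """
    area = 0.0
    for i in range(1, len(thresholds)):
        width = thresholds[i] - thresholds[i - 1]
        area += width * (recalls[i] + recalls[i - 1]) / 2.0
    return 2.0 * area


print(precision(tp=8, fp=2))   # 0.8
print(recall(tp=8, fn=8))      # 0.5
print(size_group(20, 20))      # small
print(size_group(50, 50))      # medium
print(size_group(100, 100))    # large
print(average_recall([0.5, 0.75, 1.0], [1.0, 0.5, 0.0]))  # 0.5
```

Grouping by area before computing AP and AR is what yields the APS/APM/APL and ARS/ARM/ARL series: the same formulas are applied separately to the objects in each size group.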
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, L., Wang, L., Zhu, Y. et al. CDFF: a fast and highly accurate method for recognizing traffic signs. Neural Comput & Applic 35, 643–662 (2023). https://doi.org/10.1007/s00521-022-07782-5
DOI: https://doi.org/10.1007/s00521-022-07782-5