Real-time small traffic sign detection with revised faster-RCNN

Multimedia Tools and Applications

Abstract

Traffic sign detection is a crucial step for autonomous driving and intelligent transportation. Promising results have been reported in this area, but most are limited to ideal conditions in which the traffic signs are clear and large. In practice, traffic sign detection is usually built on general object detection methods, and existing methods fail to detect most traffic signs, especially in surveillance or driving recorder videos. Traffic signs in such footage, for example traffic lights or distant road signs, typically cover less than 5% of the image in the camera's view. In this paper, we therefore propose a real-time small traffic sign detection approach based on a revised Faster-RCNN. First, we use a small region proposal generator to extract the characteristics of small traffic signs: because the stride of the generator is too large, we remove the pool4 layer of VGG-16 and adopt dilation for ResNet. Second, we combine the revised Faster-RCNN architecture with Online Hard Example Mining (OHEM) to make the system more robust in locating the regions of small traffic signs. Finally, we conduct extensive experiments and empirical evaluations on several different videos to demonstrate the performance of our approach. The experimental results show that our approach improves the mean average precision by 12.1% over the original object detection algorithm.
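The two ideas in the abstract, a smaller feature stride and hard example selection, can be illustrated with a short sketch. This is not the authors' code. It assumes the standard Faster R-CNN use of VGG-16, where the feature map is taken after four stride-2 pooling layers (total stride 16). The 20-pixel sign size and the function names `feature_stride`, `cells_covered`, and `select_hard_examples` are hypothetical. The OHEM step is simplified to picking the highest-loss regions and omits the non-maximum suppression used in the original OHEM formulation.

```python
# Illustrative sketch only: names and numbers are assumptions, not the paper's code.

# Pooling layers in front of the VGG-16 feature map that Faster R-CNN
# normally uses; each one halves the spatial resolution.
VGG16_POOLS = ["pool1", "pool2", "pool3", "pool4"]


def feature_stride(pool_layers, removed=()):
    """Total downsampling factor after the pooling layers that are kept."""
    stride = 1
    for name in pool_layers:
        if name not in removed:
            stride *= 2  # every kept 2x2 pooling layer doubles the stride
    return stride


def cells_covered(object_px, stride):
    """How many feature-map cells an object of object_px pixels spans per side."""
    return object_px / stride


def select_hard_examples(roi_losses, batch_size):
    """Simplified OHEM: indices of the batch_size regions with the highest loss."""
    order = sorted(range(len(roi_losses)),
                   key=lambda i: roi_losses[i], reverse=True)
    return order[:batch_size]


if __name__ == "__main__":
    original = feature_stride(VGG16_POOLS)
    revised = feature_stride(VGG16_POOLS, removed=("pool4",))
    sign_px = 20  # hypothetical side length of a small sign, in pixels
    print("stride with pool4:", original, "cells:", cells_covered(sign_px, original))
    print("stride without pool4:", revised, "cells:", cells_covered(sign_px, revised))
    print("hard RoIs:", select_hard_examples([0.1, 2.0, 0.5, 1.5], batch_size=2))
```

Under these assumptions, removing pool4 halves the stride from 16 to 8, so the hypothetical 20-pixel sign spans 2.5 feature-map cells per side instead of 1.25, which gives the region proposal generator more spatial detail to work with.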

Notes

  1. http://host.robots.ox.ac.uk/pascal/VOC/

  2. http://www.image-net.org/challenges/LSVRC/

  3. http://benchmark.ini.rub.de/

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 61401023.

Author information

Corresponding author

Correspondence to Guangyu Gao.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Han, C., Gao, G. & Zhang, Y. Real-time small traffic sign detection with revised faster-RCNN. Multimed Tools Appl 78, 13263–13278 (2019). https://doi.org/10.1007/s11042-018-6428-0

  • DOI: https://doi.org/10.1007/s11042-018-6428-0
