
A Practical and Highly Optimized Convolutional Neural Network for Classifying Traffic Signs in Real-Time

International Journal of Computer Vision

Abstract

Classifying traffic signs is an indispensable part of Advanced Driver Assistance Systems. The classification model must therefore classify images accurately while consuming as few CPU cycles as possible, so that the CPU is released immediately for other tasks. In this paper, we first propose a new ConvNet architecture. Then, we propose a new method for creating an optimal ensemble of ConvNets with the highest possible accuracy and the lowest number of ConvNets. Our experiments show that the ensemble of our proposed ConvNets (itself constructed using our method) reduces the number of arithmetic operations by \(88\,\%\) and \(73\,\%\) compared with two state-of-the-art ensembles of ConvNets. In addition, our ensemble is \(0.1\,\%\) more accurate than one of the state-of-the-art ensembles and only \(0.04\,\%\) less accurate than the other when tested on the same dataset. Moreover, an ensemble of our compact ConvNets reduces the number of multiplications by \(95\,\%\) and \(88\,\%\), while the classification accuracy drops by only \(0.2\,\%\) and \(0.4\,\%\) compared with these two ensembles. We also evaluate the cross-dataset performance of our ConvNet and analyze the transferability of its layers. We show that our network scales easily to new datasets with many more traffic sign classes, and that it only needs its weights fine-tuned starting from the last convolution layer. We further assess our ConvNet through different visualization techniques. Finally, we propose a new method for finding the minimum additive noise that causes the network to misclassify an image by the minimum difference compared with the highest score in the loss vector.
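
As a rough illustration of the ensemble-selection goal stated in the abstract (highest accuracy with the fewest ConvNets), the sketch below exhaustively searches subsets of already-trained models on a validation set. The abstract does not describe the proposed method itself, so this brute-force search is only an assumed stand-in, and the names `ensemble_accuracy`, `smallest_best_ensemble`, and `score_tables` are hypothetical.

```python
from itertools import combinations


def ensemble_accuracy(score_tables, labels, members):
    """Accuracy obtained by summing the class scores of the chosen models.

    score_tables[m][i][c] is the score model m gives to class c on sample i
    (a hypothetical layout chosen for this sketch).
    """
    correct = 0
    for i, label in enumerate(labels):
        num_classes = len(score_tables[0][i])
        summed = [sum(score_tables[m][i][c] for m in members)
                  for c in range(num_classes)]
        predicted = max(range(num_classes), key=summed.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


def smallest_best_ensemble(score_tables, labels):
    """Try every subset, smallest first; keep the most accurate one.

    The strict '>' comparison means a larger subset replaces a smaller one
    only if it is strictly more accurate, so ties favour fewer models.
    """
    best_members, best_acc = None, -1.0
    num_models = len(score_tables)
    for size in range(1, num_models + 1):
        for members in combinations(range(num_models), size):
            acc = ensemble_accuracy(score_tables, labels, members)
            if acc > best_acc:
                best_members, best_acc = members, acc
    return best_members, best_acc


# Toy validation set: 3 models, 4 samples, 2 classes (made-up scores).
labels = [0, 1, 0, 1]
score_tables = [
    [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]],  # model 0
    [[0.6, 0.4], [0.7, 0.3], [0.9, 0.1], [0.4, 0.6]],  # model 1
    [[0.3, 0.7], [0.4, 0.6], [0.6, 0.4], [0.8, 0.2]],  # model 2
]
best_members, best_acc = smallest_best_ensemble(score_tables, labels)
```

Exhaustive search grows exponentially with the number of candidate models, so it is only practical for small pools; it is shown here because it makes the two competing objectives explicit.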


Notes

  1. http://benchmark.ini.rub.de/.

  2. The ConvNet architecture and its trained models are available at https://github.com/pcnn/traffic-sign-recognition.

  3. The percentage of the samples that are always within the top 2 classification scores.

  4. We calculated the number of multiplications of a ConvNet by taking into account the multiplications for convolving the filters of each layer with the N-channel input from the previous layer, the multiplications required for computing the activations of each layer, and the multiplications imposed by the normalization layers. We showed in Sect. 3 that the tanh function utilized in Ciresan et al. (2012) can be efficiently computed using 10 multiplications. The ReLU activation used in Jin et al. (2014) does not need any multiplications, and the Leaky ReLU units in our ConvNet compute their results using only 1 multiplication. Finally, considering that the pow(float, float) function needs only 1 multiplication and 64 shift operations (http://tinyurl.com/yehg932), the normalization layer in Jin et al. (2014) requires \(k\times k+3\) multiplications per element of the feature map.
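
The counting rule in note 4 can be sketched as follows. The per-activation costs (10, 0, and 1 multiplications) and the \(k\times k+3\) normalization cost come from the note; the function names, the way the output size is passed in, and the example layer shape are assumptions made only for illustration and do not correspond to any layer of the networks discussed.

```python
# Multiplications per activation, as quoted in note 4.
ACTIVATION_MULTS = {"tanh": 10, "relu": 0, "leaky_relu": 1}


def conv_multiplications(out_h, out_w, num_filters, k, in_channels):
    """Multiplications for convolving k x k filters with an N-channel input.

    Each output element needs k * k * in_channels multiplications.
    """
    return out_h * out_w * num_filters * k * k * in_channels


def activation_multiplications(out_h, out_w, num_filters, activation):
    """Multiplications spent applying the activation to every output element."""
    return out_h * out_w * num_filters * ACTIVATION_MULTS[activation]


def normalization_multiplications(out_h, out_w, num_filters, norm_k):
    """k * k + 3 multiplications per feature-map element, as stated in note 4."""
    return out_h * out_w * num_filters * (norm_k * norm_k + 3)


def layer_multiplications(out_h, out_w, num_filters, k, in_channels,
                          activation, norm_k=None):
    """Total for one layer: convolution + activation + optional normalization."""
    total = conv_multiplications(out_h, out_w, num_filters, k, in_channels)
    total += activation_multiplications(out_h, out_w, num_filters, activation)
    if norm_k is not None:
        total += normalization_multiplications(out_h, out_w, num_filters, norm_k)
    return total


# Hypothetical layer: 3-channel input, ten 5 x 5 filters, 4 x 4 output map.
leaky_total = layer_multiplications(4, 4, 10, 5, 3, "leaky_relu")
tanh_total = layer_multiplications(4, 4, 10, 5, 3, "tanh")
relu_norm_total = layer_multiplications(4, 4, 10, 5, 3, "relu", norm_k=3)
```

For this toy layer the convolution dominates (12000 multiplications), but the activation and normalization terms still separate the three variants, which is why the note counts them explicitly.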

References

  • Aghdam, H. H., Heravi, E. J., & Puig, D. (2015). A unified framework for coarse-to-fine recognition of traffic signs using Bayesian network and visual attributes. In: 10th international conference on computer vision theory and applications (VISAPP) (pp. 87–96). doi:10.5220/0005303500870096

  • Baró, X., Escalera, S., Vitrià, J., Pujol, O., & Radeva, P. (2009). Traffic sign recognition using evolutionary adaboost detection and forest-ECOC classification. IEEE Transactions on Intelligent Transportation Systems, 10(1), 113–126. doi:10.1109/TITS.2008.2011702.


  • Ciresan, D., Meier, U., Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. In 2012 IEEE conference on computer vision and pattern recognition (pp. 3642–3649). IEEE. doi:10.1109/CVPR.2012.6248110, arXiv:1202.2745v1

  • Coates, A., & Ng, A. Y. (2012). Learning feature representations with K-means. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7700, 561–580. doi:10.1007/978-3-642-35289-8_30

  • Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E., Darrell, T. (2014). DeCAF: A deep convolutional activation feature for generic visual recognition. In: International conference on machine learning (pp. 647–655) arXiv:1310.1531.

  • Dosovitskiy, A., & Brox, T. (2015). Inverting convolutional networks with convolutional networks (pp. 1–15). arXiv preprint arXiv:1506.02753

  • Fleyeh, H., & Davami, E. (2011). Eigen-based traffic sign recognition. IET Intelligent Transport Systems, 5(3), 190. doi:10.1049/iet-its.2010.0159.


  • Gao, X. W., Podladchikova, L., Shaposhnikov, D., Hong, K., & Shevtsova, N. (2006). Recognition of traffic signs based on their colour and shape features extracted using human vision models. Journal of Visual Communication and Image Representation, 17(4), 675–685. doi:10.1016/j.jvcir.2005.10.003.


  • Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. doi:10.1109/CVPR.2014.81, arXiv:1311.2524.

  • Greenhalgh, J., & Mirmehdi, M. (2012). Real-time detection and recognition of road traffic signs. IEEE Transactions on Intelligent Transportation Systems, 13(4), 1498–1506. doi:10.1109/tits.2012.2208909.


  • He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. arXiv preprint arXiv:1502.01852

  • Hinton, G. (2014). Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research (JMLR), 15, 1929–1958.

  • Hsu, S. H., & Huang, C. L. (2001). Road sign detection and recognition using matching pursuit method. Image and Vision Computing, 19(3), 119–129. doi:10.1016/S0262-8856(00)00050-0.


  • Huang, G., Mao, K. Z., Siew, C., Huang, D. (2013). A hierarchical method for traffic sign classification with support vector machines. In: The 2013 international joint conference on neural networks (IJCNN) pp 1–6. IEEE. doi:10.1109/IJCNN.2013.6706803

  • Jin, J., Fu, K., & Zhang, C. (2014). Traffic sign recognition with hinge loss trained convolutional neural networks. IEEE Transactions on Intelligent Transportation Systems, 15(5), 1991–2000. doi:10.1109/TITS.2014.2308281.


  • Krizhevsky, A., Sutskever, I., Hinton, G. (2012). Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp. 1097–1105. Curran Associates, Inc.

  • Larsson, F., & Felsberg, M. (2011). Using Fourier descriptors and spatial models for traffic sign recognition. In: Image analysis lecture notes in computer science (Vol 6688, pp. 238–249). Springer. doi:10.1007/978-3-642-21227-7_23

  • Liu, H., Liu, Y., & Sun, F. (2014). Traffic sign recognition using group sparse coding. Information Sciences, 266, 75–89. doi:10.1016/j.ins.2014.01.010.


  • Lu, K., Ding, Z., & Ge, S. (2012). Sparse-representation-based graph embedding for traffic sign recognition. IEEE Transactions on Intelligent Transportation Systems, 13(4), 1515–1524. doi:10.1109/TITS.2012.2220965.


  • Maas, A. L., Hannun, A. Y., & Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In International conference on machine learning (ICML) workshop on deep learning (Vol 30)

  • Mahendran, A., & Vedaldi, A. (2015). Understanding deep image representations by inverting them. In Computer vision and pattern recognition (pp. 5188–5196). IEEE, Boston. doi:10.1109/CVPR.2015.7299155, arXiv:1412.0035

  • Maldonado-Bascon, S., Lafuente-Arroyo, S., Gil-Jimenez, P., Gomez-Moreno, H., & Lopez-Ferreras, F. (2007). Road-sign detection and recognition based on support vector machines. IEEE Transactions on Intelligent Transportation Systems, 8(2), 264–278. doi:10.1109/TITS.2007.895311.


  • Maldonado Bascón, S., Acevedo Rodríguez, J., Lafuente Arroyo, S., Fernández Caballero, A., & López-Ferreras, F. (2010). An optimization on pictogram identification for the road-sign recognition task using SVMs. Computer Vision and Image Understanding, 114(3), 373–383. doi:10.1016/j.cviu.2009.12.002.

  • Mathias, M., Timofte, R., Benenson, R., & Van Gool, L. (2013). Traffic sign recognition–How far are we from the solution? In International joint conference on neural networks. doi:10.1109/IJCNN.2013.6707049.

  • Møgelmose, A., Trivedi, M. M., & Moeslund, T. B. (2012). Vision-based traffic sign detection and analysis for intelligent driver assistance systems: Perspectives and survey. IEEE Transactions on Intelligent Transportation Systems, 13(4), 1484–1497. doi:10.1109/TITS.2012.2209421.


  • Moiseev, B., Konev, A., Chigorin, A., & Konushin, A. (2013). Evaluation of traffic sign recognition methods trained on synthetically generated data. In: 15th international conference on advanced concepts for intelligent vision systems (ACIVS) (pp. 576–583). Springer, Poznań. doi:10.1007/978-3-319-02895-8_52

  • Paclík, P., Novovičová, J., Pudil, P., & Somol, P. (2000). Road sign classification using Laplace kernel classifier. Pattern Recognition Letters, 21(13–14), 1165–1173. doi:10.1016/S0167-8655(00)00078-7.


  • Piccioli, G., De Micheli, E., Parodi, P., & Campani, M. (1996). Robust method for road sign detection and recognition. Image and Vision Computing, 14(3), 209–223. doi:10.1016/0262-8856(95)01057-2.


  • Ruta, A., Li, Y., & Liu, X. (2010). Robust class similarity measure for traffic sign recognition. IEEE Transactions on Intelligent Transportation Systems, 11(4), 846–855. doi:10.1109/TITS.2010.2051427.


  • Sermanet, P., & Lecun, Y. (2011). Traffic sign recognition with multi-scale convolutional networks. In Proceedings of the international joint conference on neural networks (pp. 2809–2813). doi:10.1109/IJCNN.2011.6033589

  • Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., & LeCun, Y. (2013). OverFeat: Integrated recognition, localization and detection using convolutional networks. In arXiv preprint arXiv:1312.6229, pp. 1–15

  • Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In International conference on learning representations (ICLR) (pp. 1–13). arXiv:1409.1556v5.

  • Simonyan, K., Vedaldi, A., & Zisserman, A. (2013). Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, pp. 1–8.

  • Stallkamp, J., Schlipsing, M., Salmen, J., & Igel, C. (2012). Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition. Neural Networks, 32, 323–332. doi:10.1016/j.neunet.2012.02.016.


  • Sun, Z. L., Wang, H., Lau, W. S., Seet, G., & Wang, D. (2014). Application of BW-ELM model on traffic sign recognition. Neurocomputing, 128, 153–159. doi:10.1016/j.neucom.2012.11.057.


  • Szegedy, C., Reed, S., Sermanet, P., Vanhoucke, V., & Rabinovich, A. (2014a). Going deeper with convolutions. In: arXiv preprint arXiv:1409.4842, pp. 1–12.

  • Szegedy, C., Zaremba, W., Sutskever, I. (2014b). Intriguing properties of neural networks. arXiv:1312.6199v4

  • Tibshirani, R. (1994). Regression Selection and Shrinkage via the Lasso. doi:10.2307/2346178.

  • Timofte, R., & Van Gool, L. (2011). Sparse representation based projections. In: 22nd British Machine Vision Conference (pp. 61.1–61.12). BMVA Press. doi:10.5244/C.25.61

  • Timofte, R., Zimmermann, K., & Van Gool, L. (2011). Multi-view traffic sign detection, recognition, and 3D localisation. Machine Vision and Applications (November):1–15. doi:10.1007/s00138-011-0391-3.

  • Van Der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579–2605. doi:10.1007/s10479-011-0841-3.


  • Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., & Gong, Y. (2010). Locality-constrained linear coding for image classification. In IEEE computer vision and pattern recognition (CVPR) (pp. 3360–3367). doi:10.1109/CVPR.2010.5540018.

  • Yosinski, J., Clune, J., Bengio, Y., & Lipson, H. (2014). How transferable are features in deep neural networks? Neural Information Processing Systems (NIPS), 27. arXiv:1411.1792v1.

  • Yuan, X., Hao, X., Chen, H., & Wei, X. (2014). Robust traffic sign recognition based on color global and local oriented edge magnitude patterns. IEEE Transactions on Intelligent Transportation Systems, 15(4), 1466–1474. doi:10.1109/TITS.2014.2298912.


  • Zaklouta, F., & Stanciulescu, B. (2012). Real-time traffic-sign recognition using tree classifiers. IEEE Transactions on Intelligent Transportation Systems, 13(4), 1507–1514. doi:10.1109/TITS.2012.2225618.


  • Zaklouta, F., & Stanciulescu, B. (2014). Real-time traffic sign recognition in three stages. Robotics and Autonomous Systems, 62(1), 16–24. doi:10.1016/j.robot.2012.07.019.


  • Zaklouta, F., Stanciulescu, B., & Hamdoun, O. (2011). Traffic sign classification using K-d trees and random forests. In Proceedings of the international joint conference on neural networks (pp. 2151–2155). doi:10.1109/IJCNN.2011.6033494.

  • Zeiler, M., & Fergus, R. (2014). Visualizing and understanding convolutional networks. European Conference on Computer Vision (ECCV), 8689, 818–833. doi:10.1007/978-3-319-10590-1_53, arXiv:1311.2901.

  • Zeng, Y., Xu, X., Fang, Y., & Zhao, K. (2015). Traffic sign recognition using deep convolutional networks and extreme learning machine. In Intelligence science and big data engineering. Image and video data engineering (IScIDE) (pp. 272–280). Springer. doi:10.1007/978-3-319-23989-7_28.


Acknowledgments

The authors are grateful for the support granted by Generalitat de Catalunya’s Agència de Gestió d’Ajuts Universitaris i de Recerca (AGAUR) through FI-DGR 2015 and Martí Franquès 2015 fellowships.

Author information


Corresponding author

Correspondence to Hamed Habibi Aghdam.

Additional information

Communicated by Hiroshi Ishikawa, Takeshi Masuda, Yasuyo Kita and Katsushi Ikeuchi.


About this article


Cite this article

Aghdam, H.H., Heravi, E.J. & Puig, D. A Practical and Highly Optimized Convolutional Neural Network for Classifying Traffic Signs in Real-Time. Int J Comput Vis 122, 246–269 (2017). https://doi.org/10.1007/s11263-016-0955-9


  • DOI: https://doi.org/10.1007/s11263-016-0955-9
