DOI: 10.1145/3546157.3546166
research-article

Traffic Sign Recognition with Vision Transformers

Published: 22 August 2022

ABSTRACT

Traffic sign recognition is an integral part of future autonomous driving systems. Deep learning has been applied to this task, but the performance of recent vision Transformers remains unexplored. In this study, eight different vision Transformers are evaluated on three real-world traffic sign datasets for the first time. The experimental results show that the performance of the best vision Transformer falls between that of the pre-trained DenseNet and that of the DenseNet trained from scratch. In addition, the best vision Transformer model requires less training time than DenseNet.

      • Published in

        ICISDM '22: Proceedings of the 6th International Conference on Information System and Data Mining
        May 2022
        144 pages
        ISBN:9781450396257
        DOI:10.1145/3546157

        Copyright © 2022 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 22 August 2022

        Qualifiers

        • research-article
        • Research
        • Refereed limited