Tripartite Architecture License Plate Recognition Based on Transformer

Xia, Ran; Song, Wei; Liu, Xiangchun; Zhao, Xiaobing

doi:10.1007/978-981-99-8432-9_33

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14426)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Under natural conditions, license plate recognition is easily affected by factors such as lighting and shooting angle. Because Chinese license plates come in many types and Chinese characters have a more intricate structure than Latin characters, accurate recognition of Chinese license plates is a significant challenge. To address this issue, we introduce a novel Chinese License Plate Transformer (CLPT). In CLPT, license plate images pass through a Transformer encoder, and the resulting tokens are divided into four categories (province, main, suffix, and noise) by an Auto Token Classify (ATC) mechanism. The first three categories are used to predict the corresponding parts of the license plate: the province, the main body, and the suffix. In our tests, we employed YOLOv8-pose as the license plate detector; it detects both bounding boxes and key points, which allows perspective distortion in license plates to be corrected. Experimental results on the CCPD, CLPD, and CBLPRD datasets demonstrate the superior performance of our method in recognizing both single-row and double-row license plates. We achieved accuracies of 99.6%, 99.5%, and 89.3% on the CCPD Tilt, Rotate, and Challenge subsets, respectively. In addition, our method attained an accuracy of 87.7% on CLPD and 99.9% on CBLPRD, and maintained 99.5% accuracy even for yellow double-row license plates in CBLPRD.
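The abstract mentions that detected key points help correct perspective distortion. The sketch below illustrates one standard way to do this, assuming the detector returns the four plate corners: it solves for the homography that maps those corners onto an upright rectangle. It is not taken from the paper. The corner coordinates, the 440 x 140 output size, and the function names `solve_linear`, `homography`, and `warp_point` are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations act on both sides at once.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping each src point to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h.
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1), rearranged the same way.
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: List[List[float]], p: Point) -> Point:
    """Apply homography h to a single point in image coordinates."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical corner key points of a tilted plate, ordered
# top-left, top-right, bottom-right, bottom-left (pixels).
corners = [(120.0, 85.0), (310.0, 60.0), (322.0, 128.0), (131.0, 160.0)]
# Target rectangle; 440 x 140 is an arbitrary output size for this sketch.
target = [(0.0, 0.0), (440.0, 0.0), (440.0, 140.0), (0.0, 140.0)]

H = homography(corners, target)
rectified = [warp_point(H, c) for c in corners]
```

With `H` in hand, the rectified plate image would be produced by resampling the source image through the mapping. In practice a library routine such as OpenCV's `cv2.getPerspectiveTransform` followed by `cv2.warpPerspective` would likely be used rather than a hand-written solver.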

Author information

Authors and Affiliations

School of Information Engineering, Minzu University of China, Beijing, 100081, China
Ran Xia, Wei Song, Xiangchun Liu & Xiaobing Zhao
National Language Resource Monitoring and Research Center of Minority Languages, Minzu University of China, Beijing, 100081, China
Wei Song & Xiaobing Zhao
Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance of MOE, Minzu University of China, Beijing, 100081, China
Wei Song & Xiaobing Zhao

Corresponding author

Correspondence to Wei Song.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

About this paper

Cite this paper

Xia, R., Song, W., Liu, X., Zhao, X. (2024). Tripartite Architecture License Plate Recognition Based on Transformer. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14426. Springer, Singapore. https://doi.org/10.1007/978-981-99-8432-9_33

DOI: https://doi.org/10.1007/978-981-99-8432-9_33
Published: 24 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8431-2
Online ISBN: 978-981-99-8432-9
eBook Packages: Computer Science, Computer Science (R0)
