Driver License Field Detection Using Real-Time Deep Networks

Tsai, Chun-Ming; Hsieh, Jun-Wei; Chang, Ming-Ching; Lin, Yu-Chen

doi:10.1007/978-3-030-55789-8_52

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12144)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

We present an automatic system for real-time visual detection and recognition of multiple driver’s license fields using a deep YOLOv3 detection network. Driver’s licenses are essential photo IDs frequently checked by law enforcement and insurers. Automatically detecting and recognizing multiple fields on the license can replace manual key-in and significantly improve workflow. In this paper, we develop an Intelligent Driving License Reading System (IDLRS) that addresses three challenges: (1) fields and contents that vary across multiple types and versions of driver’s licenses, (2) varying capture angles and illumination from a mobile camera, and (3) the need for fast processing in real-world applications. To retain high detection accuracy and versatility, we propose to detect multiple field contents directly in a single shot by adopting and fine-tuning the YOLOv3-608 detector, which detects 11 fields from the new Taiwan driver’s license with an accuracy of 97.5%. Our approach does not rely on text detection or OCR and outperforms them when tested at large viewing angles. To further examine this capability, we evaluate four large tilted-view configurations (top, bottom, left, right) and achieve accuracies of 93.3%, 90.2%, 97.5%, and 94.3%, respectively.
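
To make the phrase "fine-tuning the YOLOv3-608 detector" for 11 fields more concrete, the following minimal sketch computes the output tensor shapes that a standard YOLOv3 head would have for that class count and input size. It assumes the stock YOLOv3 layout (three detection scales with strides 32, 16, and 8, and three anchors per scale); the authors' actual network configuration is not shown in this preview, and the function names are hypothetical.

```python
# Sketch only: head dimensions for a standard YOLOv3 with a custom class count.
# NUM_FIELDS and INPUT_SIZE come from the abstract; the rest are stock YOLOv3
# settings assumed here, not values confirmed by the paper.
NUM_FIELDS = 11          # license fields treated as object classes
INPUT_SIZE = 608         # YOLOv3-608 input resolution
ANCHORS_PER_SCALE = 3    # assumed: standard YOLOv3 anchors per detection scale
STRIDES = (32, 16, 8)    # assumed: standard YOLOv3 downsampling factors


def yolo_head_shapes(num_classes, input_size=INPUT_SIZE):
    """Return (grid, grid, channels) for each of the three detection scales."""
    if input_size % 32 != 0:
        raise ValueError("YOLOv3 input size must be a multiple of 32")
    # Each anchor predicts 4 box offsets, 1 objectness score, and class scores.
    channels = ANCHORS_PER_SCALE * (num_classes + 5)
    return [(input_size // s, input_size // s, channels) for s in STRIDES]


def total_candidate_boxes(shapes):
    """Count the boxes predicted per image before thresholding and NMS."""
    return sum(grid * grid * ANCHORS_PER_SCALE for grid, _, _ in shapes)


shapes = yolo_head_shapes(NUM_FIELDS)
print(shapes)                         # [(19, 19, 48), (38, 38, 48), (76, 76, 48)]
print(total_candidate_boxes(shapes))  # 22743
```

Under these assumptions, adapting the detector to 11 field classes means each detection layer is preceded by a convolution with 48 filters, and all fields are predicted in one forward pass rather than by a separate text-detection stage.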

Acknowledgements

This work is supported by the Ministry of Science and Technology, Taiwan, under Grants MOST 107-2221-E-845-005 and MOST 108-2221-E-845-003-MY3. We thank Walter Slocombe for help improving the writing of the paper.

Author information

Authors and Affiliations

Department of Computer Science, University of Taipei, Taipei, Taiwan
Chun-Ming Tsai & Yu-Chen Lin
College of Artificial Intelligence and Green Energy, National Chiao Tung University, Hsinchu, Taiwan
Jun-Wei Hsieh
Department of Computer Science, University at Albany, State University of New York, Albany, USA
Ming-Ching Chang

Corresponding author

Correspondence to Chun-Ming Tsai.

Editor information

Editors and Affiliations

Iwate Prefectural University, Takizawa, Japan
Hamido Fujita
Harbin Institute of Technology (Shenzhen), Shenzhen, China
Philippe Fournier-Viger
Texas State University, San Marcos, TX, USA
Moonis Ali
Iwate Prefectural University, Takizawa, Japan
Jun Sasaki

About this paper

Cite this paper

Tsai, CM., Hsieh, JW., Chang, MC., Lin, YC. (2020). Driver License Field Detection Using Real-Time Deep Networks. In: Fujita, H., Fournier-Viger, P., Ali, M., Sasaki, J. (eds) Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices. IEA/AIE 2020. Lecture Notes in Computer Science, vol 12144. Springer, Cham. https://doi.org/10.1007/978-3-030-55789-8_52

DOI: https://doi.org/10.1007/978-3-030-55789-8_52
Published: 04 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-55788-1
Online ISBN: 978-3-030-55789-8
eBook Packages: Computer Science, Computer Science (R0)
