Deep Learning for Optical Character Recognition and Its Application to VAT Invoice Recognition

Wang, Yu; Gui, Guan; Zhao, Nan; Yin, Yue; Huang, Hao; Li, Yunyi; Wang, Jie; Yang, Jie; Zhang, Haijun

doi:10.1007/978-981-13-6508-9_12

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 517)

Included in the following conference series:

International Conference in Communications, Signal Processing, and Systems

Abstract

Optical character recognition (OCR) is a long-standing and active research topic because it converts paper documents into a computer-readable format, and its use is growing steadily. However, the recognition accuracy of current OCR techniques still needs to improve for some special applications, such as the reimbursement of value-added tax (VAT) invoices. This paper proposes two OCR techniques, based on a deep convolutional neural network (CNN) and a residual network (ResNet), respectively. On our test dataset, the former reaches a recognition accuracy of up to 97.08%, while the latter reaches 99.38%.
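The abstract does not describe either architecture, so the following is only a generic sketch of what distinguishes a residual block from a plain convolutional block: the block's input is added back to its output through an identity shortcut. It is not the authors' network. The 1-D signals, the kernel values, and the function names conv1d_same, plain_block and residual_block are all assumptions made for illustration.

```python
from typing import List


def conv1d_same(x: List[float], kernel: List[float]) -> List[float]:
    """Zero-padded 1-D cross-correlation; output length equals input length.

    Assumes an odd-length kernel (illustrative helper, not from the paper).
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def relu(x: List[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def plain_block(x: List[float], k1: List[float], k2: List[float]) -> List[float]:
    """Two stacked convolutions with ReLU, no shortcut: y = relu(F(x))."""
    return relu(conv1d_same(relu(conv1d_same(x, k1)), k2))


def residual_block(x: List[float], k1: List[float], k2: List[float]) -> List[float]:
    """Same two convolutions plus an identity shortcut: y = relu(F(x) + x)."""
    f = conv1d_same(relu(conv1d_same(x, k1)), k2)
    return relu([a + b for a, b in zip(f, x)])


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0]
    zero = [0.0, 0.0, 0.0]
    # With all-zero weights the plain block loses the input entirely,
    # while the residual block still passes it through the shortcut.
    print(plain_block(x, zero, zero))
    print(residual_block(x, zero, zero))
```

With all-zero kernels the plain block outputs zeros, whereas the residual block still returns the rectified input. This is why a residual block only has to learn a correction to the identity mapping.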

Author information

Authors and Affiliations

College of Telecommunication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, 21003, China
Yu Wang, Guan Gui, Nan Zhao, Yue Yin, Hao Huang, Yunyi Li, Jie Wang, Jie Yang & Haijun Zhang

Corresponding author

Correspondence to Guan Gui.

Editor information

Editors and Affiliations

Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Qilian Liang
School of Information and Communication Engineering, Dalian University of Technology, Dalian, China
Xin Liu
School of Information Science and Technology, Dalian Maritime University, Dalian, China
Zhenyu Na
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Wei Wang
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Jiasong Mu
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Baoju Zhang

About this paper

Cite this paper

Wang, Y. et al. (2020). Deep Learning for Optical Character Recognition and Its Application to VAT Invoice Recognition. In: Liang, Q., Liu, X., Na, Z., Wang, W., Mu, J., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2018. Lecture Notes in Electrical Engineering, vol 517. Springer, Singapore. https://doi.org/10.1007/978-981-13-6508-9_12

DOI: https://doi.org/10.1007/978-981-13-6508-9_12
Published: 14 June 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6507-2
Online ISBN: 978-981-13-6508-9
eBook Packages: Engineering, Engineering (R0)