Research on fast text recognition method for financial ticket image

Zhang, Hanning; Dong, Bo; Zheng, Qinghua; Feng, Boqin

doi:10.1007/s10489-022-03467-7

Published: 09 April 2022

Volume 52, pages 18156–18166 (2022)

Applied Intelligence

Hanning Zhang^1,2, Bo Dong^3,4 (ORCID: orcid.org/0000-0001-7695-9072), Qinghua Zheng^1 & Boqin Feng^1

352 Accesses
7 Citations
1 Altmetric

Abstract

Deep learning methods are now widely applied and have advanced many fields. In financial accounting, the rapid growth in the number of financial tickets has sharply increased labor costs, so using deep learning to relieve the pressure on accounting is necessary. A few works have applied deep learning methods to financial ticket recognition, but they share three limitations. First, their approaches cover only a few types of tickets. Second, the precision and speed of their recognition models do not meet the requirements of practical financial accounting systems. Third, none of them provides a detailed analysis of both the types and the content of tickets. This paper therefore first analyzes the features of 482 kinds of financial tickets, divides them into three categories, and proposes a recognition pattern for each category. These recognition patterns can meet almost all financial ticket recognition needs. Second, for fixed-format financial tickets (accounting for 68.27% of all ticket types), we propose a simple yet efficient network, the Financial Ticket Faster Detection network (FTFDNet), based on Faster R-CNN. Furthermore, to obtain higher recognition accuracy, the loss function, Region Proposal Network (RPN), and Non-Maximum Suppression (NMS) are improved according to the characteristics of financial ticket text, so that FTFDNet focuses more on text. Finally, we compare FTFDNet with the best ticket recognition model from the ICDAR2019 invoice competition. The experimental results demonstrate the effectiveness of these improvements: the accuracy of the method reaches 97.4%, and the recognition speed increases by 50%.
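For readers unfamiliar with the NMS step named in the abstract, the following is a minimal sketch of the standard greedy Non-Maximum Suppression used in Faster R-CNN-style detectors. It is not the improved NMS of FTFDNet, whose details are not given in this preview; the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Standard greedy NMS (illustrative, not the paper's variant).

    Visits boxes in order of decreasing score and keeps a box only if its
    IoU with every already-kept box is at most iou_threshold.
    Returns the indices of the kept boxes.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical text-line proposals: the first two overlap heavily.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
kept = nms(example_boxes, example_scores)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed and only the first and third proposals remain. A text-oriented variant would presumably change how overlap is scored for long, thin text lines, but that is an assumption rather than something stated in the abstract.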

Author information

Authors and Affiliations

School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, China
Hanning Zhang, Qinghua Zheng & Boqin Feng
Shaanxi Province Key Laboratory of Satellite and Terrestrial Network Technology Research and Development, Xi’an Jiaotong University, Xi’an, 710049, China
Hanning Zhang
School of Continuing Education, Xi’an Jiaotong University, Xi’an, 710049, China
Bo Dong
National Engineering Lab for Big Data Analytics, Xi’an Jiaotong University, Xi’an, 710049, China
Bo Dong

Corresponding author

Correspondence to Bo Dong.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was partially supported by the National Science Foundation of China under Grant Nos. 62050194, 62037001, 61721002 and 62002282, the MOE Innovation Research Team No. IRT_17R86, and the Project of XJTU-SERVYOU Joint Tax-AI Lab.

About this article

Cite this article

Zhang, H., Dong, B., Zheng, Q. et al. Research on fast text recognition method for financial ticket image. Appl Intell 52, 18156–18166 (2022). https://doi.org/10.1007/s10489-022-03467-7

Accepted: 04 March 2022
Published: 09 April 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s10489-022-03467-7
