Abstract
Deep learning methods are now widely applied and have advanced many fields. In financial accounting, the rapid growth in the number of financial tickets sharply increases labor costs, so deep learning is needed to relieve the pressure on accounting. A few works have applied deep learning to financial ticket recognition, but they have three limitations. First, their approaches cover only a few types of tickets. Second, the precision and speed of their recognition models do not meet the requirements of practical financial accounting systems. Third, none of them provides a detailed analysis of both the types and the content of tickets. This paper therefore first analyzes the features of 482 kinds of financial tickets, divides them into three categories, and proposes a recognition pattern for each category. These recognition patterns can meet almost all types of financial ticket recognition needs. Second, for the fixed-format types of financial tickets (accounting for 68.27% of the total types of tickets), we propose a simple yet efficient network based on Faster R-CNN, named the Financial Ticket Faster Detection network (FTFDNet). To obtain higher recognition accuracy, the loss function, Region Proposal Network (RPN), and Non-Maximum Suppression (NMS) are modified according to the characteristics of financial ticket text so that FTFDNet focuses more on text. Finally, we compare FTFDNet with the best ticket recognition model from the ICDAR2019 invoice competition. The experimental results demonstrate the effectiveness of these improvements: the accuracy of the method reaches 97.4% and the recognition speed increases by 50%.
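For readers unfamiliar with the components named in the abstract, the following is a minimal sketch of standard greedy Non-Maximum Suppression over axis-aligned boxes, the baseline step that a Faster R-CNN style detector applies to its proposals. It is not the modified NMS used in FTFDNet, whose details the abstract does not give. The `(x1, y1, x2, y2)` box layout, the function names, the threshold of 0.5, and the sample boxes are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Standard greedy NMS: repeatedly keep the highest-scoring box and
    discard remaining boxes whose IoU with it exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Three hypothetical text-line proposals; the first two overlap heavily.
boxes = [(10, 10, 110, 30), (12, 11, 112, 31), (10, 50, 110, 70)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

With these sample inputs the second box is suppressed by the first, while the third, which lies on a separate text line, is kept. Text boxes on tickets tend to be long and thin, so the choice of overlap measure and threshold matters more than for roughly square objects, which is one plausible reason a text-focused detector would revisit this step.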





This research was partially supported by the National Science Foundation of China under Grant Nos. 62050194, 62037001, 61721002 and 62002282, the MOE Innovation Research Team No. IRT_17R86, and Project of XJTU-SERVYOU Joint Tax-AI Lab.
Cite this article
Zhang, H., Dong, B., Zheng, Q. et al. Research on fast text recognition method for financial ticket image. Appl Intell 52, 18156–18166 (2022). https://doi.org/10.1007/s10489-022-03467-7
DOI: https://doi.org/10.1007/s10489-022-03467-7