ABSTRACT
To address irregular text regions and blurred text in images, this paper proposes a text recognition method based on weakly supervised learning. The method consists of an explicit rectification module, a vision module, a language module, and a fusion module. Irregular text regions are first corrected by the rectification module; the vision module extracts features with a convolutional neural network, recognizes them with a location attention mechanism, and outputs a predicted string; the language module learns sequence information through an attention mechanism and corrects the string predicted by the vision module; finally, the fusion module combines the outputs of the vision module and the language module according to their weights to obtain the final prediction. The language module avoids direct interference from image blur and improves the accuracy of text recognition. Experiments on several common datasets demonstrate the effectiveness of the proposed method.
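The weighted combination performed by the fusion module can be illustrated with a minimal sketch. It assumes that each module emits one probability distribution over characters per position and that the two are blended with a single fixed scalar weight; the abstract does not say how the weight is obtained, so the names `fuse`, `decode`, `weight`, and the toy character set below are hypothetical and not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]  # one row of character probabilities per position

CHARSET = "acot"  # hypothetical toy character set: a=0, c=1, o=2, t=3


def fuse(vision: Matrix, language: Matrix, weight: float = 0.5) -> Matrix:
    """Blend two per-position distributions: weight * vision + (1 - weight) * language."""
    if len(vision) != len(language):
        raise ValueError("both modules must predict the same number of positions")
    return [
        [weight * v + (1.0 - weight) * l for v, l in zip(v_row, l_row)]
        for v_row, l_row in zip(vision, language)
    ]


def decode(probs: Matrix, charset: str = CHARSET) -> str:
    """Pick the most probable character at every position."""
    return "".join(
        charset[max(range(len(row)), key=row.__getitem__)] for row in probs
    )


# Assumed example: blur makes the vision module slightly prefer "o" over "a"
# at the middle position, while the language module strongly prefers "a".
VISION = [
    [0.05, 0.85, 0.05, 0.05],
    [0.45, 0.00, 0.55, 0.00],
    [0.00, 0.10, 0.00, 0.90],
]
LANGUAGE = [
    [0.10, 0.70, 0.10, 0.10],
    [0.90, 0.00, 0.10, 0.00],
    [0.05, 0.05, 0.00, 0.90],
]

fused_text = decode(fuse(VISION, LANGUAGE, weight=0.5))
```

In this toy case the vision output alone decodes to "cot", while the fused output decodes to "cat", because the language distribution outweighs the uncertain visual evidence at the middle position. Setting `weight=1.0` reduces the fusion to the vision prediction.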
Index Terms
- Text Recognition Based on Weakly Supervised Learning