skip to main content
10.1145/3573428.3573462acmotherconferencesArticle/Chapter ViewAbstractPublication PageseitceConference Proceedingsconference-collections
research-article

Sequence Recognition of Scene Text Based on CRNN and CTPN Models

Published: 15 March 2023 Publication History

Abstract

Image-based sequence recognition has lately emerged as a prominent study subject in the science of computer vision, while text detection and identification in natural situations has emerged as an active research field. Based on scene text data, this paper addresses the theory of deep learning-based CRNN and CTPN models and the process of processing text. Using CRNN, text recognition can be turned into a time-dependent sequence learning issue, which is commonly employed for indeterminate-length text sequences. Contextual relationships between text images are learned using BLSTM and CTC, thus effectively improving text recognition accuracy and making the model more robust. It also excels in text recognition tests for wordless and lexical-based scenes, as it is not constrained by any predefined language. It produces a more efficient, but smaller, model that is more suited to real-world settings. CRNN recognition accuracy is lower for short texts with large morphological changes, such as artistic words, or texts with large changes in natural scenes. Because of the Anchor setting, CTPN can only detect horizontally distributed text, but a small improvement can detect vertical text by adding horizontal Anchor. As a result of the limitations of the framework, the irregularly inclined text can be detected very broadly.

References

[1]
Baoguang Shi, Xiang Bai, and Cong Yao. 2015. An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition.
[2]
Zhi Tian, Weilin Huang, Tong He, Pan He, and Yu Qiao. 2016. Detecting Text in Natural Image with Connectionist Text Proposal Network. ECCV
[3]
Girshick, R.: Fast r-cnn. 2015, in IEEE International Conference on Computer Vision (ICCV)
[4]
Jaderberg, M., Simonyan, K., Vedaldi, A., Zisserman, A.: Reading text in the wild with convolutional neural networks. International Journal of Computer Vision (IJCV), 2016, Reading text in the wild with convolutional neural networks
[5]
K. Wang, B. Babenko, and S. Belongie. End-to-end scene text recognition. In ICCV, 2011.
[6]
A. Bissacco, M. Cummins, Y. Netzer, and H. Neven. Photoocr: Reading text in uncontrolled conditions. In ICCV, 2013.
[7]
Busta, M., Neumann, L., Matas, J.: Fastext: Efficient unconstrained scene text detector, 2015, in IEEE International Conference on Computer Vision (ICCV)
[8]
M. Jaderberg, K. Simonyan, A. Vedaldi, and A. Zisserman. Synthetic data and artificial neural networks for natural scene text recognition. NIPS Deep Learning Workshop, 2014.
[9]
M. Jaderberg, K. Simonyan, A. Vedaldi, and A. Zisserman. Deep structured output learning for unconstrained text recognition. In ICLR, 2015.
[10]
Cheng, M., Zhang, Z., Lin, W., Torr, P.: Bing: Binarized normed gradients for objectness estimation at 300fps, 2014, in IEEE Computer Vision and Pattern Recognition (CVPR)

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
EITCE '22: Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering
October 2022
1999 pages
ISBN:9781450397148
DOI:10.1145/3573428
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 March 2023

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. CTPN
  2. LSTM
  3. OCR
  4. Sequence recognition

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EITCE 2022

Acceptance Rates

Overall Acceptance Rate 508 of 972 submissions, 52%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 54
    Total Downloads
  • Downloads (Last 12 months)24
  • Downloads (Last 6 weeks)3
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

HTML Format

View this article in HTML Format.

HTML Format

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media