Abstract
Understanding text that appears in natural scenes is essential to a wide range of applications. The task remains challenging for the document analysis and recognition community because natural scene images are complex. In this paper, we propose a new method that detects text regions by identifying the locations of characters. Our work centers on two networks: one for text detection and one for text recognition. For text detection, the proposed method directly predicts the characters or text lines that appear in full scene images, and it handles text with arbitrary orientations and quadrilateral shapes. To do so, the model produces two scores: a character position score and a character similarity score. These scores are used to group the individual characters into a single object. For text recognition, the detected text is fed into a second network, which extracts features from the text images and maps those features to a sequence of characters. Experiments on public datasets show that the proposed approach is competitive with state-of-the-art approaches.
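The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two scores are combined, so the thresholds `t_pos` and `t_sim`, the 4-connected flood fill, and the function name `group_characters` are all assumptions made for illustration. The sketch treats both scores as per-pixel maps, keeps pixels where either score is high, and reports one bounding box per connected region that contains at least one character pixel.

```python
from collections import deque


def group_characters(position, similarity, t_pos=0.5, t_sim=0.5):
    """Group character pixels into text objects (hypothetical sketch).

    position and similarity are 2D lists of scores in [0, 1] with the same
    shape. Returns a list of (top, left, bottom, right) bounding boxes.
    """
    rows, cols = len(position), len(position[0])
    # Keep a pixel if it looks like a character or like a link between characters.
    mask = [[position[r][c] >= t_pos or similarity[r][c] >= t_sim
             for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            top, left, bottom, right = r0, c0, r0, c0
            has_char = False
            while queue:
                r, c = queue.popleft()
                has_char = has_char or position[r][c] >= t_pos
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Discard regions made only of link pixels (no character inside).
            if has_char:
                boxes.append((top, left, bottom, right))
    return boxes


# Toy one-row maps: two characters joined by a link pixel, plus an isolated one.
position_map = [[0.9, 0.1, 0.8, 0.0, 0.0, 0.7]]
similarity_map = [[0.0, 0.9, 0.0, 0.0, 0.0, 0.0]]
boxes = group_characters(position_map, similarity_map)
```

In this toy input the first three pixels merge into one box because the high similarity score at column 1 bridges the two characters, while the character at column 5 stays a separate object.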
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cu, V.L., Truong, X.V., Luu, T.D., Nguyen, H.V. (2022). Region Awareness for Identifying and Extracting Text in the Natural Scene. In: Yang, XS., Sherratt, S., Dey, N., Joshi, A. (eds) Proceedings of Sixth International Congress on Information and Communication Technology. Lecture Notes in Networks and Systems, vol 236. Springer, Singapore. https://doi.org/10.1007/978-981-16-2380-6_44
DOI: https://doi.org/10.1007/978-981-16-2380-6_44
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-2379-0
Online ISBN: 978-981-16-2380-6
eBook Packages: Intelligent Technologies and Robotics (R0)