ABSTRACT
Detecting text in natural scene images is a challenging task. In this paper, we propose a character-level, end-to-end algorithm for detecting text in natural scene images. Text detection is generally divided into three tasks: text localization, text segmentation, and text recognition. The proposed method aims not only to localize text but also to recognize it. It consists of four steps: character candidate patch extraction, patch classification using an ensemble of ResNets, non-character region elimination, and character region grouping via self-tuning spectral clustering. In the first step, character candidate patches are extracted from the image using both edge information from multi-scale images and Maximally Stable Extremal Regions (MSERs). Each patch is then classified as either a character patch or a non-character patch by a deep network composed of three ResNets with different hyper-parameters. Text regions are determined by filtering out the non-character patches. To further reduce classification errors, character characteristics are used to compensate for errors in the classification results of the ResNet ensemble. To evaluate text detection performance, character regions are grouped via self-tuning spectral clustering. The proposed method shows competitive performance on the ICDAR 2013 dataset.
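As a rough illustration of two of the steps named in the abstract, the following Python sketch shows one way an ensemble vote and a self-tuning affinity could be computed. It is not the authors' implementation. The abstract does not say how the three ResNet outputs are combined, so averaging their character probabilities and applying a 0.5 threshold is an assumption made here. The affinity follows the local-scaling idea of self-tuning spectral clustering, where each point's scale is its distance to its k-th nearest neighbour. The function names, the use of patch centres as 2-D points, and the value of `k` are all hypothetical.

```python
import math


def ensemble_char_probability(probs):
    """Average the character probabilities reported by several classifiers.

    Averaging is an assumption; the abstract does not state the fusion rule.
    """
    return sum(probs) / len(probs)


def filter_patches(patch_probs, threshold=0.5):
    """Return indices of patches kept as character patches.

    patch_probs holds one tuple per patch, with one probability per network.
    The 0.5 threshold is a hypothetical default.
    """
    return [i for i, probs in enumerate(patch_probs)
            if ensemble_char_probability(probs) >= threshold]


def local_scaling_affinity(points, k=2):
    """Build a locally scaled affinity matrix for a list of 2-D points.

    A[i][j] = exp(-d(i, j)^2 / (sigma_i * sigma_j)), where sigma_i is the
    distance from point i to its k-th nearest neighbour. The diagonal is zero.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # sorted(row)[0] is the zero self-distance, so index k is the k-th neighbour.
    # The small floor avoids division by zero when points coincide.
    sigma = [max(sorted(row)[min(k, n - 1)], 1e-12) for row in dist]
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                scale = sigma[i] * sigma[j]
                affinity[i][j] = math.exp(-dist[i][j] ** 2 / scale)
    return affinity


if __name__ == "__main__":
    # Three hypothetical patches, each scored by three networks.
    scores = [(0.9, 0.8, 0.7), (0.2, 0.4, 0.1), (0.6, 0.7, 0.5)]
    print("kept patches:", filter_patches(scores))

    # Hypothetical patch centres forming two well-separated groups.
    centres = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
    A = local_scaling_affinity(centres, k=2)
    print("within-group affinity:", A[0][1])
    print("between-group affinity:", A[0][3])
```

In a full spectral clustering pipeline, the affinity matrix would next be normalized and its leading eigenvectors clustered. That step is omitted here to keep the sketch dependency-free.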
Index Terms
- A Robust Ensemble of ResNets for Character Level End-to-end Text Detection in Natural Scene Images