Abstract
In this paper, we propose an approach to scene text detection that leverages both the appearance and the consensus of connected components. Component appearance is modeled with an SVM-based dictionary classifier, and component consensus is represented with color and spatial layout features. Responses of the dictionary classifier are integrated with the consensus features into a discriminative model, where the importance of each feature is determined by a text-level training procedure. In text detection, hypotheses are generated on component pairs, and an iterative extension procedure aggregates the hypotheses into text objects. Throughout detection, the discriminative model is used both to perform classification and to control the extension. Experiments show that the proposed approach reaches the state of the art in both detection accuracy and computational efficiency and, in particular, performs best when dealing with low-resolution text in cluttered backgrounds.
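The detection flow summarized above (pairwise hypotheses, a discriminative score, iterative extension) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Component` fields, the four pair features, and the `WEIGHTS` and `BIAS` values are all assumptions chosen for illustration. In the paper the appearance response comes from an SVM-based dictionary classifier and the weights are learned by text-level training, neither of which is reproduced here.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Component:
    """Hypothetical summary of one connected component."""
    x: float           # centre x of the bounding box
    y: float           # centre y of the bounding box
    h: float           # bounding-box height (assumed > 0)
    color: float       # mean gray level in [0, 1], a stand-in for colour features
    appearance: float  # stand-in for the dictionary classifier response in [0, 1]


# Assumed values; the paper learns feature importance with text-level training.
WEIGHTS = (1.0, 1.0, 1.0, 1.0)
BIAS = -3.0


def pair_features(a, b):
    """Appearance plus simple colour and spatial-layout consensus features."""
    scale = max(a.h, b.h)
    appearance = 0.5 * (a.appearance + b.appearance)
    color_sim = 1.0 - abs(a.color - b.color)
    height_sim = min(a.h, b.h) / scale
    gap = abs(a.x - b.x) / scale       # horizontal distance in units of height
    v_off = abs(a.y - b.y) / scale     # vertical misalignment in units of height
    layout = max(0.0, 1.0 - gap / 3.0) * max(0.0, 1.0 - v_off)
    return (appearance, color_sim, height_sim, layout)


def score(a, b):
    """Linear discriminative score; positive means 'same text object'."""
    return sum(w * f for w, f in zip(WEIGHTS, pair_features(a, b))) + BIAS


def detect(components):
    """Seed hypotheses on component pairs, then extend each to a fixed point."""
    n = len(components)
    seeds = [{i, j} for i, j in combinations(range(n), 2)
             if score(components[i], components[j]) > 0]
    texts = []
    for seed in seeds:
        if any(seed <= t for t in texts):
            continue  # this pair was already absorbed by an earlier hypothesis
        group = set(seed)
        changed = True
        while changed:  # iterative extension, controlled by the same score
            changed = False
            for k in range(n):
                if k not in group and any(
                        score(components[k], components[m]) > 0 for m in group):
                    group.add(k)
                    changed = True
        texts.append(group)
    return [sorted(t) for t in texts]
```

Using one scoring function for both seeding and extension mirrors the abstract's statement that the discriminative model performs classification and controls the extension; the specific acceptance rule (a positive score with any current member) is a simplification assumed here.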
Notes
1. “Connected component” is shortened to “component” in the following.
Acknowledgement
The partial support of this research by DARPA through BBN/DARPA Award HR0011-08-C-0004 under subcontract 9500009235, and by the US Government through NSF Award IIS-0812111, is gratefully acknowledged.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ye, Q., Doermann, D. (2014). Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2013. Lecture Notes in Computer Science, vol 8357. Springer, Cham. https://doi.org/10.1007/978-3-319-05167-3_4
DOI: https://doi.org/10.1007/978-3-319-05167-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05166-6
Online ISBN: 978-3-319-05167-3
eBook Packages: Computer Science, Computer Science (R0)