Abstract
In this paper, we propose a method that leverages context information for text detection in natural scene images. Most existing methods rely only on hand-engineered features to describe text regions; in contrast, we build a confidence map model that integrates each candidate's appearance with its relationships to adjacent candidates. A three-layer filtering strategy is designed to classify the text candidates, which removes a large number of non-text regions. To retrieve missing text regions, a context fusion step is then performed. Finally, the remaining connected components (CCs) are grouped into text lines, the lines are verified, and the verified lines are split into separate words. Experimental results on two benchmark datasets, ICDAR 2005 and ICDAR 2013, demonstrate that the proposed approach achieves performance competitive with state-of-the-art algorithms.
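A minimal sketch of the confidence-map, filtering, and context-fusion steps summarised above, under stated assumptions: the Candidate class, the scoring weights, and the thresholds keep_thr and rescue_thr are illustrative placeholders, not the authors' actual features, classifiers, or parameters.

from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    bbox: tuple          # (x, y, w, h) of a connected-component candidate
    appearance: float    # appearance-based text confidence in [0, 1]
    confidence: float = 0.0

def context_score(c: Candidate, others: List[Candidate]) -> float:
    """Score a candidate by how text-like its horizontally adjacent neighbours are."""
    x, y, w, h = c.bbox
    neighbours = [o for o in others if o is not c
                  and abs(o.bbox[1] - y) < h        # roughly the same row
                  and abs(o.bbox[0] - x) < 3 * w]   # horizontally close
    return max((o.appearance for o in neighbours), default=0.0)

def detect_text(candidates: List[Candidate], keep_thr=0.5, rescue_thr=0.8):
    # 1. Confidence map: fuse each candidate's own appearance with its context.
    for c in candidates:
        c.confidence = 0.5 * c.appearance + 0.5 * context_score(c, candidates)

    # 2. Filtering: discard low-confidence (likely non-text) components.
    kept = [c for c in candidates if c.confidence >= keep_thr]

    # 3. Context fusion: retrieve discarded components that sit next to
    #    high-confidence text, so broken characters are not lost.
    kept_ids = {id(c) for c in kept}
    rescued = [c for c in candidates if id(c) not in kept_ids
               and context_score(c, kept) >= rescue_thr]
    return kept + rescued

The surviving components would then be grouped into text lines, verified, and split into words, as described in the abstract.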
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
Cite this paper
Wang, R., Sang, N., Gao, C., Kuang, X., Xiang, J. (2014). Text Detection in Natural Scene Images Leveraging Context Information. In: Li, S., Liu, C., Wang, Y. (eds) Pattern Recognition. CCPR 2014. Communications in Computer and Information Science, vol 484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45643-9_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45642-2
Online ISBN: 978-3-662-45643-9