Abstract
Text in natural images is an important cue for cross-media retrieval, indexing, and understanding. However, detecting it is challenging because of varying backgrounds, low contrast between text and non-text regions, perspective distortion, and other disturbing factors. In this paper, we propose a novel text line detection method that detects text lines aligned along a straight line in any direction. The method consists of three steps. In the first step, we use the maximally stable extremal region (MSER) detector with a dam line constraint to detect candidate text regions, and we define a similarity measurement between two regions that combines their sizes, absolute distance, relative distance, contextual information, and color histograms. In the second step, we propose a text line identification algorithm based on this similarity measurement. The algorithm first searches for three regions to serve as the seeds of a line and then expands from them to obtain all regions in the line. In the last step, we develop a filter to remove non-text lines. The filter uses a sparse classifier based on two dictionaries learned from feature vectors extracted from the morphological skeletons of the candidate text lines. A comparative study on two datasets shows that the proposed method detects text lines accurately, whether they are horizontal or share an arbitrary consistent orientation.
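The seed-and-expand idea in the second step can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the similarity formula, so the sketch assumes a simplified stand-in in which two regions are "similar" when their heights differ by at most a fixed ratio, and a region belongs to a line when its centroid lies within a pixel tolerance of the line through the seeds. The function names (`collinear`, `similar_size`, `find_text_line`), the `(cx, cy, height)` region format, and the thresholds `tol` and `ratio` are all hypothetical.

```python
from itertools import combinations
from math import hypot


def collinear(p, q, r, tol=2.0):
    """Return True if point r lies within `tol` pixels of the line through p and q."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    length = hypot(x2 - x1, y2 - y1)
    if length == 0:
        return False  # p and q coincide, so they do not define a direction
    # Perpendicular distance from r to the infinite line through p and q.
    dist = abs((x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)) / length
    return dist <= tol


def similar_size(a, b, ratio=2.0):
    """Assumed stand-in for the similarity measurement: heights within `ratio`."""
    return max(a, b) <= ratio * min(a, b)


def find_text_line(regions, tol=2.0, ratio=2.0):
    """Find one line of regions; `regions` is a list of (cx, cy, height) tuples.

    Returns the sorted indices of the regions on the first line found,
    or an empty list if no three regions qualify as seeds.
    """
    for i, j, k in combinations(range(len(regions)), 3):
        seeds = [regions[i], regions[j], regions[k]]
        heights = [s[2] for s in seeds]
        # Seed search: three mutually similar, collinear regions.
        if not all(similar_size(a, b, ratio) for a, b in combinations(heights, 2)):
            continue
        p, q, r = (s[:2] for s in seeds)
        if not collinear(p, q, r, tol):
            continue
        # Expansion: add every remaining region that fits the seed line.
        line = [i, j, k]
        for m, (cx, cy, h) in enumerate(regions):
            if m in line:
                continue
            if collinear(p, q, (cx, cy), tol) and similar_size(h, heights[0], ratio):
                line.append(m)
        return sorted(line)
    return []
```

For example, given four similar-sized regions centered on the diagonal y = x, one off-line region, and one on-line region that is five times taller, `find_text_line` keeps only the four diagonal regions. Because the seeds are tested for collinearity in any direction rather than for a shared baseline, the same procedure applies to horizontal and to arbitrarily oriented lines.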
Acknowledgments
We would like to express our sincere thanks to both the anonymous associate editor and reviewers for their constructive comments that have significantly improved the quality as well as readability of the paper.
This work was supported by the National Natural Science Foundation of China (No. 60673088).
Cite this article
Yuan, J., Wei, B., Liu, Y. et al. A method for text line detection in natural images. Multimed Tools Appl 74, 859–884 (2015). https://doi.org/10.1007/s11042-013-1702-7
DOI: https://doi.org/10.1007/s11042-013-1702-7