
A method for text line detection in natural images

Multimedia Tools and Applications

Abstract

Text in natural images is important for cross-media retrieval, indexing and understanding. However, detecting it is challenging because of varying backgrounds, low contrast between text and non-text regions, perspective distortion and other disturbing factors. In this paper, we propose a novel text line detection method that can detect text lines aligned along a straight line in any direction. It consists of three main steps. In the first step, we use the maximally stable extremal region (MSER) detector with a dam line constraint to detect candidate text regions. We then define a similarity measure between two regions that combines sizes, absolute distance, relative distance, contextual information and color histograms. In the second step, we propose a text line identification algorithm based on this similarity measure. The algorithm first searches for three regions to serve as the seeds of a line, and then expands from these seeds to collect all regions in the line. In the last step, we develop a filter to remove non-text lines. The filter uses a sparse classifier based on two dictionaries learned from feature vectors extracted from the morphological skeletons of the candidate text lines. A comparative study on two datasets shows that the proposed method accurately detects text lines with horizontal or arbitrary consistent orientation.
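The seed-and-expand idea of the second step can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Region` type, the `similarity` function (which uses only size ratio and size-relative distance, omitting contextual information and color histograms), and the thresholds `min_sim` and `tol` are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Region:
    """Hypothetical candidate region: centroid (x, y) and bounding-box height h."""
    x: float
    y: float
    h: float


def similarity(a, b):
    """Assumed similarity in (0, 1]: size ratio damped by size-relative distance."""
    size = min(a.h, b.h) / max(a.h, b.h)
    dist = math.hypot(a.x - b.x, a.y - b.y) / max(a.h, b.h)
    return size / (1.0 + dist)


def line_distance(p, a, b):
    """Perpendicular distance from the centroid of p to the line through a and b."""
    dx, dy = b.x - a.x, b.y - a.y
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return math.hypot(p.x - a.x, p.y - a.y)
    return abs(dx * (p.y - a.y) - dy * (p.x - a.x)) / norm


def find_seed(regions, min_sim=0.2, tol=0.3):
    """Return the most mutually similar, roughly collinear triple, or None."""
    best, best_score = None, 0.0
    for trio in combinations(regions, 3):
        a, b, c = sorted(trio, key=lambda r: (r.x, r.y))
        if min(similarity(a, b), similarity(b, c)) < min_sim:
            continue
        if line_distance(b, a, c) > tol * b.h:
            continue
        score = similarity(a, b) + similarity(b, c)
        if score > best_score:
            best, best_score = (a, b, c), score
    return best


def expand_line(seed, regions, min_sim=0.2, tol=0.3):
    """Grow the seed by adding regions near the seed line and similar to a member."""
    line = list(seed)
    a, c = seed[0], seed[-1]
    changed = True
    while changed:  # repeat until a full pass adds nothing
        changed = False
        for r in regions:
            if r in line:
                continue
            if line_distance(r, a, c) > tol * r.h:
                continue
            if max(similarity(r, m) for m in line) < min_sim:
                continue
            line.append(r)
            changed = True
    return sorted(line, key=lambda r: (r.x, r.y))
```

Because the seed line is defined by two centroids rather than an assumed horizontal baseline, the same procedure applies to lines in any direction. A typical call would be `expand_line(find_seed(regions), regions)` after checking that `find_seed` did not return `None`.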




Acknowledgments

We would like to express our sincere thanks to both the anonymous associate editor and reviewers for their constructive comments that have significantly improved the quality as well as readability of the paper.

This work is supported by the National Natural Science Foundation of China (No. 60673088).

Author information

Corresponding author

Correspondence to Jie Yuan.

About this article

Cite this article

Yuan, J., Wei, B., Liu, Y. et al. A method for text line detection in natural images. Multimed Tools Appl 74, 859–884 (2015). https://doi.org/10.1007/s11042-013-1702-7

  • DOI: https://doi.org/10.1007/s11042-013-1702-7
