An effective graph-cut scene text localization with embedded text segmentation

Multimedia Tools and Applications

Abstract

This paper presents an effective and efficient approach to extracting scene text from images. The approach first extracts edge information with the local maximum difference filter (LMDF) and, at the same time, decomposes the input image into a group of image layers by color clustering. Candidate text image layers are then identified by combining the geometric structure and spatial distribution of scene text with the edge map. Next, at the character level, candidate text connected components are identified using a set of heuristic rules. Finally, a graph-cut computation identifies and localizes text lines of arbitrary orientation. In the proposed approach, the segmentation of text pixels is embedded efficiently into the text localization computation. Comprehensive experiments on four challenging datasets (ICDAR 2003, ICDAR 2011, MSRA-TD500, and Street View Text (SVT)) validate the approach. In comparisons with many state-of-the-art methods, the results demonstrate that the approach effectively handles scene text with diverse fonts, sizes, colors, languages, and arbitrary orientations, and that it is robust to illumination changes.
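As a rough illustration of the layer-decomposition step mentioned in the abstract, the sketch below splits a tiny RGB image into binary layers by color clustering. The abstract does not say which clustering algorithm or how many clusters the authors use, so plain k-means, the cluster count `k`, and the helper names `kmeans_colors` and `decompose_into_layers` are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]
Image = List[List[Tuple[int, int, int]]]  # rows of RGB pixels
Mask = List[List[int]]                    # rows of 0/1 flags


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(pixel: Sequence[float], centers: List[Color]) -> int:
    """Index of the cluster center closest to the pixel."""
    return min(range(len(centers)), key=lambda j: sq_dist(pixel, centers[j]))


def kmeans_colors(pixels: List[Tuple[int, int, int]], k: int,
                  iters: int = 10) -> List[Color]:
    """Plain k-means on RGB values (an assumed stand-in, not the paper's method)."""
    # Deterministic initialisation: the first k distinct colors in scan order.
    centers: List[Color] = []
    for p in pixels:
        c = tuple(float(v) for v in p)
        if c not in centers:
            centers.append(c)
        if len(centers) == k:
            break
    if len(centers) < k:
        raise ValueError("image has fewer than k distinct colors")
    for _ in range(iters):
        labels = [nearest(p, centers) for p in pixels]
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # keep the old center if a cluster goes empty
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers


def decompose_into_layers(image: Image, k: int) -> List[Mask]:
    """Return k binary masks; mask j marks the pixels assigned to cluster j."""
    pixels = [p for row in image for p in row]
    centers = kmeans_colors(pixels, k)
    return [[[1 if nearest(p, centers) == j else 0 for p in row] for row in image]
            for j in range(k)]


# Toy 2x4 image: dark background pixels and bright "text" pixels.
dark, dark2 = (10, 12, 11), (14, 10, 12)
bright, bright2 = (250, 248, 251), (245, 250, 249)
toy_image = [[dark, bright, bright2, dark2],
             [dark2, dark, bright, dark]]
toy_layers = decompose_into_layers(toy_image, k=2)
```

Each pixel lands in exactly one layer, so later stages (such as the connected-component analysis the abstract describes) could process every layer as an independent binary image. A real implementation would likely operate on array data and choose the number of clusters more carefully.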

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants No. 61232013, No. 61271434, and No. 61175115.

Author information

Corresponding author

Correspondence to Xiaoqian Liu.

About this article

Cite this article

Liu, X., Wang, W. An effective graph-cut scene text localization with embedded text segmentation. Multimed Tools Appl 74, 4891–4906 (2015). https://doi.org/10.1007/s11042-013-1848-3

  • DOI: https://doi.org/10.1007/s11042-013-1848-3