
A robust model for salient text detection in natural scene images using MSER feature detector and Grabcut

Multimedia Tools and Applications

Abstract

Visual attention models are used to identify the most prominent regions in a natural scene, that is, the regions most likely to attract human attention. State-of-the-art models, however, tend to under-predict image regions that contain text. These regions often carry the highest semantic significance in a natural scene and are useful for saliency-based applications such as image classification and captioning. Detecting text or characters as a salient region in an image remains a challenging research problem. Text within a scene conveys vital information about the image; for example, the content of a signboard conveys important information to a visually impaired person. In this paper, we propose a new model for salient text detection in natural scenes. The proposed model integrates a saliency model with segmentation and text detection approaches to generate the text saliency. Experimental results, reported as ROC and DET curves, show that the proposed model outperforms state-of-the-art methods for detecting salient text content in natural scenes.
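As a rough illustration of the evaluation mentioned in the abstract, the following sketch shows one common way to obtain ROC and DET points from a saliency map and a binary ground-truth text mask. It is not the authors' evaluation code: the function name `roc_det_points`, the list-of-lists input format, and the choice of thresholds are assumptions made for this example. A ROC curve plots the true positive rate against the false positive rate, while a DET curve plots the false negative rate against the false positive rate.

```python
def roc_det_points(saliency, ground_truth, thresholds):
    """Return (threshold, TPR, FPR, FNR) tuples for a saliency map.

    saliency: 2D list of floats in [0, 1] (hypothetical text-saliency map).
    ground_truth: 2D list of 0/1 flags, where 1 marks a text pixel.
    thresholds: values at which the saliency map is binarized.
    """
    # Flatten both maps so each pixel is one (score, label) pair.
    scores = [s for row in saliency for s in row]
    labels = [g for row in ground_truth for g in row]
    positives = sum(labels)
    negatives = len(labels) - positives

    points = []
    for t in thresholds:
        # A pixel is predicted as salient text when its score reaches t.
        tp = sum(1 for s, g in zip(scores, labels) if s >= t and g == 1)
        fp = sum(1 for s, g in zip(scores, labels) if s >= t and g == 0)
        tpr = tp / positives if positives else 0.0
        fpr = fp / negatives if negatives else 0.0
        fnr = (positives - tp) / positives if positives else 0.0
        points.append((t, tpr, fpr, fnr))
    return points


if __name__ == "__main__":
    # Toy 2x3 example, invented for illustration only.
    toy_saliency = [[0.9, 0.8, 0.1], [0.7, 0.2, 0.3]]
    toy_truth = [[1, 1, 0], [0, 0, 1]]
    for t, tpr, fpr, fnr in roc_det_points(toy_saliency, toy_truth, [0.25, 0.5, 0.75]):
        print(f"t={t:.2f}  TPR={tpr:.3f}  FPR={fpr:.3f}  FNR={fnr:.3f}")
```

Plotting the (FPR, TPR) pairs gives the ROC curve and the (FPR, FNR) pairs give the DET curve; DET plots are usually drawn on normal-deviate axes, which this sketch does not do.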



Author information

Corresponding author

Correspondence to Anand Singh Jalal.


About this article


Cite this article

Gupta, N., Jalal, A.S. A robust model for salient text detection in natural scene images using MSER feature detector and Grabcut. Multimed Tools Appl 78, 10821–10835 (2019). https://doi.org/10.1007/s11042-018-6613-1


  • DOI: https://doi.org/10.1007/s11042-018-6613-1
