Robust Scene Text Detection for Multi-script Languages Using Deep Learning

  • Conference paper
  • In: MultiMedia Modeling (MMM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10132)

Abstract

Text detection in natural images is in high demand for many real-life applications, such as image retrieval and self-navigation. This work addresses robust text detection in natural scene images, especially for multi-script text. Unlike existing works that treat multi-script characters as groups of text fragments, we treat them as non-connected components. Specifically, we first propose a novel representation named Linked Extremal Regions (LER) to extract full characters instead of fragments of scene characters. Second, we propose a two-stage convolutional neural network for discriminating multi-script text in images with cluttered backgrounds, for more robust text detection. Experimental results on three well-known datasets, namely ICDAR 2011, ICDAR 2013 and MSRA-TD500, demonstrate that the proposed method outperforms the state-of-the-art methods and is also language independent.
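
The idea that one character may consist of several non-connected components (the dot and stem of "i", or the separate strokes of many CJK characters) can be illustrated with a small sketch. This is not the LER algorithm from the paper, whose details are not given in the abstract; it is a hypothetical stand-in that links component bounding boxes lying within a small gap of each other, using union-find. The function name `link_components`, the box format and the `max_gap` threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def interval_gap(a_lo: int, a_hi: int, b_lo: int, b_hi: int) -> int:
    """Distance between two 1-D intervals; 0 when they overlap or touch."""
    return max(0, max(a_lo, b_lo) - min(a_hi, b_hi))


def link_components(boxes: List[Box], max_gap: int = 3) -> List[Box]:
    """Merge boxes whose horizontal and vertical gaps are both <= max_gap.

    Illustrative only: a simple proximity rule, not the paper's LER linking.
    """
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of components that are close in both directions.
    for i, a in enumerate(boxes):
        for j in range(i + 1, len(boxes)):
            b = boxes[j]
            dx = interval_gap(a[0], a[2], b[0], b[2])
            dy = interval_gap(a[1], a[3], b[1], b[3])
            if dx <= max_gap and dy <= max_gap:
                parent[find(i)] = find(j)

    # Each linked group becomes one character candidate: the union of its boxes.
    merged: Dict[int, Box] = {}
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        root = find(i)
        if root in merged:
            m = merged[root]
            merged[root] = (min(m[0], x0), min(m[1], y0),
                            max(m[2], x1), max(m[3], y1))
        else:
            merged[root] = (x0, y0, x1, y1)
    return sorted(merged.values())


# The dot and stem of an "i", plus an unrelated neighbouring letter.
components = [(10, 0, 12, 2), (10, 4, 12, 14), (30, 4, 38, 14)]
candidates = link_components(components)
print(candidates)
```

In this toy input the dot and stem are 2 pixels apart vertically and overlap horizontally, so they merge into one candidate, while the third box is too far away to be linked. A real detector would obtain the components from an extremal-region extractor and would likely use a richer linking criterion than a fixed pixel gap.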

Acknowledgments

The work described in this paper was supported by the Natural Science Foundation of China under Grant Nos. 61672273, 61272218 and 61321491, and by the Science Foundation for Distinguished Young Scholars of Jiangsu under Grant No. BK20160021.

Corresponding author

Correspondence to Tong Lu.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Liu, RZ. et al. (2017). Robust Scene Text Detection for Multi-script Languages Using Deep Learning. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_27

  • DOI: https://doi.org/10.1007/978-3-319-51811-4_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51810-7

  • Online ISBN: 978-3-319-51811-4

  • eBook Packages: Computer Science, Computer Science (R0)
