Text Localization with Hierarchical Multiple Feature Learning

  • Conference paper
Advances in Multimedia Information Processing -- PCM 2015 (PCM 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9314)

Abstract

In this paper, we focus on English text localization in natural scene images. We propose a hierarchical localization framework that proceeds from characters to strings to words. Unlike existing methods, which rely either on sophisticated hand-crafted features or on heavy learning models, our approach uses simple but effective features and learning models. For character localization, we introduce two-level character structure features that work together with Histogram of Oriented Gradients (HOG) and Convolutional Neural Network (CNN) features. For string localization, we propose a nine-dimensional string feature for discriminative verification after characters are grouped. For the final word localization, we learn an optimal splitting strategy, based on interval cues, to split strings into words. Experiments on the challenging ICDAR benchmark datasets demonstrate the effectiveness and superiority of our approach.
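
As a rough illustration of the last stage, the sketch below splits one string of character boxes into words wherever the horizontal interval between neighbouring characters is large. It is not the learned splitting strategy of the paper: the fixed threshold, the `gap_ratio` parameter, the box format and the function name `split_string_into_words` are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def split_string_into_words(char_boxes: List[Box],
                            gap_ratio: float = 0.5) -> List[List[Box]]:
    """Split one text string into words using horizontal interval cues.

    Assumption: a fixed threshold (gap_ratio * mean character width)
    stands in for the splitting rule that the paper learns from data.
    """
    if not char_boxes:
        return []

    # Order characters left to right.
    boxes = sorted(char_boxes, key=lambda b: b[0])

    # Use the mean character width as a scale for the gap threshold.
    mean_width = sum(b[2] - b[0] for b in boxes) / len(boxes)
    threshold = gap_ratio * mean_width

    words: List[List[Box]] = [[boxes[0]]]
    for prev, cur in zip(boxes, boxes[1:]):
        gap = cur[0] - prev[2]  # interval between neighbouring characters
        if gap > threshold:
            words.append([cur])      # large interval: start a new word
        else:
            words[-1].append(cur)    # small interval: same word
    return words


example = [(0, 0, 10, 10), (12, 0, 22, 10), (40, 0, 50, 10), (52, 0, 62, 10)]
words = split_string_into_words(example)
print(len(words))
```

In the example, the two narrow intervals stay inside words and the single wide interval becomes a word boundary. A learned strategy would replace the hand-set threshold with a decision fitted to annotated data.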

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grants 61373077, 61472334 and 61170179, the Natural Science Foundation of Fujian Province of China under Grant 2013J01257, the Fundamental Research Funds for the Central Universities under Grant 20720130720, the 2014 national college students’ innovative and entrepreneurial training project, and the Scientific Research Foundation for the Introduction of Talent at Xiamen University of Technology under Grant YKJ12023R.

Author information

Corresponding author

Correspondence to Yanyun Qu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Qu, Y., Lin, L., Liao, W., Liu, J., Wu, Y., Wang, H. (2015). Text Localization with Hierarchical Multiple Feature Learning. In: Ho, YS., Sang, J., Ro, Y., Kim, J., Wu, F. (eds) Advances in Multimedia Information Processing -- PCM 2015. PCM 2015. Lecture Notes in Computer Science, vol 9314. Springer, Cham. https://doi.org/10.1007/978-3-319-24075-6_61

  • DOI: https://doi.org/10.1007/978-3-319-24075-6_61

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24074-9

  • Online ISBN: 978-3-319-24075-6

  • eBook Packages: Computer Science (R0)
