Scene Text Recognition: No Country for Old Men?

  • Conference paper
Computer Vision - ACCV 2014 Workshops (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9009)

Abstract

It is generally accepted that off-the-shelf OCR engines do not perform well in unconstrained scenarios such as natural scene imagery, where text appears among the clutter of the scene. However, recent research demonstrates that a conventional shape-based OCR engine can produce competitive results in the end-to-end scene text recognition task when provided with a conveniently preprocessed image. In this paper we confirm this finding with a set of experiments in which two off-the-shelf OCR engines are combined with an open implementation of a state-of-the-art scene text detection framework. The obtained results demonstrate that, in such a pipeline, conventional OCR solutions still perform competitively compared to other solutions specifically designed for scene text recognition.

Notes

  1. Word Lens and the Google Translate service are examples of real applications of end-to-end scene text detection and recognition that have reached market-level maturity.

  2. http://code.google.com/p/tesseract-ocr/.

  3. OCR Omnipage Professional, available at http://www.nuance.com/.

  4. The word list is provided by the Microsoft Web N-Gram Service (http://webngram.research.microsoft.com/info/) and contains the top 100k most frequently searched words on the Bing search engine.

  5. http://docs.opencv.org/trunk/modules/text/doc/erfilter.html.

  6. http://code.google.com/p/tesseract-ocr/.

  7. http://finereader.abbyy.com/.

Acknowledgement

This project was supported by the Spanish project TIN2011-24631, the fellowship RYC-2009-05031, and the Catalan government scholarship 2013 FI1126. The authors also want to thank Google Inc. for the support received through the GSoC project, as well as the OpenCV community, especially Stefano Fabri and Vadim Pisarevsky, for their help in the implementation of the scene text detection module evaluated in this paper.

Author information

Corresponding author

Correspondence to Lluís Gómez.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gómez, L., Karatzas, D. (2015). Scene Text Recognition: No Country for Old Men?. In: Jawahar, C., Shan, S. (eds) Computer Vision - ACCV 2014 Workshops. ACCV 2014. Lecture Notes in Computer Science, vol 9009. Springer, Cham. https://doi.org/10.1007/978-3-319-16631-5_12

  • DOI: https://doi.org/10.1007/978-3-319-16631-5_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16630-8

  • Online ISBN: 978-3-319-16631-5

  • eBook Packages: Computer Science, Computer Science (R0)
