A Method for Text Localization and Recognition in Real-World Images

Neumann, Lukas; Matas, Jiri

doi:10.1007/978-3-642-19318-7_60

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6494)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

A general method for text localization and recognition in real-world images is presented. The proposed method is novel in that it (i) departs from a strict feed-forward pipeline and replaces it with a hypotheses-verification framework that processes multiple text line hypotheses simultaneously, (ii) uses synthetic fonts to train the algorithm, eliminating the need for time-consuming acquisition and labeling of real-world training data, and (iii) exploits Maximally Stable Extremal Regions (MSERs), which provide robustness to geometric and illumination conditions.
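The stability idea behind the MSERs named in (iii) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a tiny hand-made grayscale image with dark text on a bright background, 4-connectivity, and a coarse threshold grid. The helper names `region_area` and `most_stable_threshold` and the parameter `delta` are hypothetical, chosen for illustration only. The sketch grows the region around a seed pixel as the threshold rises and picks the threshold at which the region's area changes least, relative to its size.

```python
from collections import deque


def region_area(img, seed, t):
    """Area of the 4-connected region of pixels <= t that contains `seed`.

    Returns 0 when the seed pixel itself is brighter than t.
    """
    rows, cols = len(img), len(img[0])
    if img[seed[0]][seed[1]] > t:
        return 0
    seen = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in seen and img[nr][nc] <= t:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return len(seen)


def most_stable_threshold(img, seed, thresholds, delta=1):
    """Pick the threshold whose region grows least across +/- delta steps.

    Stability is measured as (area[i + delta] - area[i - delta]) / area[i];
    smaller values mean a more stable region.
    """
    areas = [region_area(img, seed, t) for t in thresholds]
    best_t, best_q = None, float("inf")
    for i in range(delta, len(thresholds) - delta):
        if areas[i] == 0:
            continue  # the seed is not part of any region at this threshold
        q = (areas[i + delta] - areas[i - delta]) / areas[i]
        if q < best_q:
            best_t, best_q = thresholds[i], q
    return best_t, areas


# Assumed toy image: a dark stroke (20) with softer edges (90) on a bright background.
img = [
    [200, 200, 200, 200, 200, 200, 200],
    [200, 90, 20, 20, 20, 90, 200],
    [220, 220, 220, 220, 220, 220, 220],
]
thresholds = list(range(0, 241, 40))  # 0, 40, ..., 240
best_t, areas = most_stable_threshold(img, seed=(1, 3), thresholds=thresholds)
print(best_t, areas)
```

In this toy case the selected region is the stroke together with its soft edges, which keeps the same area over a range of thresholds before it merges with the background. Practical detectors also bound the region area and process every region in the image, not just one seed.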

The performance of the method is evaluated on two standard datasets. On the Char74k dataset, a recognition rate of 72% is achieved, 18% higher than the state of the art. The paper is the first to report both text detection and recognition results on the standard and rather challenging ICDAR 2003 dataset. The text localization works for a number of alphabets, and the method is easily adapted to the recognition of other scripts, e.g. Cyrillic.

Author information

Authors and Affiliations

Center for Machine Perception, Czech Technical University, Prague, Czech Republic
Lukas Neumann & Jiri Matas

Editor information

Editors and Affiliations

Technion – Israel Institute of Technology, Department of Computer Science, 32000, Haifa, Israel
Ron Kimmel
The University of Auckland, 37 Kohimarama Road, Mission Bay, 1071, Auckland, New Zealand
Reinhard Klette
National Institute of Informatics, Chiyoda, 1018430, Tokyo, Japan
Akihiro Sugimoto

About this paper

Cite this paper

Neumann, L., Matas, J. (2011). A Method for Text Localization and Recognition in Real-World Images. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6494. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19318-7_60

DOI: https://doi.org/10.1007/978-3-642-19318-7_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19317-0
Online ISBN: 978-3-642-19318-7
eBook Packages: Computer Science, Computer Science (R0)
