A Real-Time Scene Text to Speech System

Neumann, Lukáš; Matas, Jiří

doi:10.1007/978-3-642-33885-4_66

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7585)

Included in the following conference series:

European Conference on Computer Vision

Abstract

An end-to-end real-time scene text localization and recognition method is demonstrated. The method localizes text in still images, video, or a webcam stream, performs character recognition (OCR), and “reads” the text aloud using a text-to-speech engine. The method was recently published, achieves state-of-the-art results on public datasets, and recognizes a variety of fonts and scripts, including non-Latin ones.

Real-time performance is achieved by posing character detection as an efficient sequential selection from the set of Extremal Regions (ERs), which has linear computational complexity in the number of pixels in the image. Robustness to blur, noise, and variations in illumination and color is also demonstrated. Finally, we show the effects of various control parameters.
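
As a rough illustration of why ER-based detection can be fast, the following Python sketch enumerates extremal regions by adding pixels in order of increasing intensity and merging neighbours with a union-find structure, so that each region's area and bounding box are updated incrementally rather than recomputed. It is a minimal sketch under stated assumptions, not the published method: the function name `extremal_regions`, the 4-connectivity, the 8-bit grayscale input, and the choice of descriptors are assumptions made here for illustration, and the classifier that performs the sequential selection is omitted.

```python
from typing import Dict, List, Tuple

BBox = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def extremal_regions(img: List[List[int]]) -> Dict[int, List[Tuple[int, BBox]]]:
    """Illustrative only: for each intensity threshold t, report the
    (area, bounding box) of every region that grew or appeared at t."""
    h, w = len(img), len(img[0])

    # Counting sort of pixel indices by intensity (assumed range 0..255).
    buckets: List[List[int]] = [[] for _ in range(256)]
    for y in range(h):
        for x in range(w):
            buckets[img[y][x]].append(y * w + x)

    parent = [-1] * (h * w)          # -1 marks a pixel not yet added
    area = [0] * (h * w)
    bbox: List[BBox] = [(0, 0, 0, 0)] * (h * w)

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if area[ra] < area[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        # Descriptors of the merged region come from the two parts alone.
        area[ra] += area[rb]
        p, q = bbox[ra], bbox[rb]
        bbox[ra] = (min(p[0], q[0]), min(p[1], q[1]),
                    max(p[2], q[2]), max(p[3], q[3]))

    regions: Dict[int, List[Tuple[int, BBox]]] = {}
    for t in range(256):
        if not buckets[t]:
            continue
        for p in buckets[t]:
            y, x = divmod(p, w)
            parent[p], area[p], bbox[p] = p, 1, (x, y, x, y)
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and parent[ny * w + nx] != -1:
                    union(p, ny * w + nx)
        # Only regions touched at this threshold have changed.
        roots = {find(p) for p in buckets[t]}
        regions[t] = sorted((area[r], bbox[r]) for r in roots)
    return regions
```

Each pixel is inserted once and merged with at most four neighbours, so the work grows close to linearly with the number of pixels. In a full detector, a classifier would score each region from such incrementally maintained descriptors; that step is not shown here.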

Author information

Authors and Affiliations

Centre for Machine Perception, Department of Cybernetics, Czech Technical University, Prague, Czech Republic
Lukáš Neumann & Jiří Matas

Editor information

Editors and Affiliations

Dipartimento di Ingegneria Elettrica, Gestionale e Meccanica (DIEGM), Università degli Studi di Udine, Via delle Scienze, 208, 33100, Udine, Italy
Andrea Fusiello
IIT Istituto Italiano di Tecnologia, Via Morego 30, 16163, Genoa, Italy
Vittorio Murino
Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Modena e Reggio Emilia, Strada Vignolege, 905, 41125, Modena, Italy
Rita Cucchiara

Cite this paper

Neumann, L., Matas, J. (2012). A Real-Time Scene Text to Speech System. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33885-4_66

DOI: https://doi.org/10.1007/978-3-642-33885-4_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33884-7
Online ISBN: 978-3-642-33885-4
eBook Packages: Computer Science, Computer Science (R0)
