Text Recognition in Videos Using a Recurrent Connectionist Approach

Elagouni, Khaoula; Garcia, Christophe; Mamalet, Franck; Sébillot, Pascale

doi:10.1007/978-3-642-33266-1_22

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7553)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

Most optical character recognition (OCR) systems developed to recognize text embedded in multimedia documents segment the text into characters before recognizing them. In this paper, we propose a novel approach that avoids any explicit character segmentation. Using a multi-scale scanning scheme, texts extracted from videos are first represented as sequences of learnt features. These representations are then fed to a connectionist recurrent model specifically designed to take into account dependencies between successive learnt features and to recognize the text. The proposed video OCR, evaluated on a database of TV news videos, achieves very high recognition rates. Experiments also demonstrate that, for this recognition task, learnt feature representations perform better than hand-crafted features.
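
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the window scales, the step size and the blank symbol are all hypothetical choices made for illustration. The sketch covers two steps only: generating multi-scale windows along a text image, and turning per-position labels into a string without segmenting characters. The learnt feature extractor and the recurrent model are left out, and the blank-collapsing decoding rule is an assumption (a greedy rule in the style of connectionist temporal classification), not something the abstract states.

```python
from typing import List, Sequence, Tuple

Window = Tuple[int, int]  # (left column, width); hypothetical representation


def multi_scale_windows(image_width: int, image_height: int,
                        scales: Sequence[float] = (0.5, 1.0, 1.5),
                        step_ratio: float = 0.25) -> List[List[Window]]:
    """For each horizontal scan position, return one window per scale.

    Window widths are multiples of the text height. All values are
    illustrative assumptions, not the settings used in the paper.
    """
    step = max(1, int(image_height * step_ratio))
    sequence = []
    for center in range(0, image_width, step):
        frame = []
        for scale in scales:
            width = min(image_width, max(1, int(image_height * scale)))
            # Centre the window on the scan position, clamped to the image.
            left = min(max(0, center - width // 2), image_width - width)
            frame.append((left, width))
        sequence.append(frame)
    return sequence


def collapse_labels(frame_labels: Sequence[str], blank: str = "-") -> str:
    """Merge repeated per-position labels, then drop blanks.

    This greedy rule is an assumed decoding step for a recurrent model
    that emits one label (or a blank) per scan position.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return "".join(decoded)


# Usage: 20 scan positions over a 100x20 text image, 3 windows each.
windows = multi_scale_windows(image_width=100, image_height=20)
text = collapse_labels(list("NN-EE-W-SS"))
```

In a full system, each group of windows would be mapped to a feature vector by a learnt model, and the resulting sequence would be labelled by the recurrent network before the collapsing step.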

Author information

Authors and Affiliations

Orange Labs R&D, 35512, Cesson Sévigné, France
Khaoula Elagouni & Franck Mamalet
IRISA, INSA de Rennes, 35042, Rennes, France
Khaoula Elagouni & Pascale Sébillot
LIRIS, INSA de Lyon, 69621, Villeurbanne, France
Christophe Garcia

Editor information

Editors and Affiliations

Neuro Heuristic Research Group, University of Lausanne, 1015, Lausanne, Switzerland
Alessandro E. P. Villa
Department of Informatics, Nicolaus Copernicus University, 87-100, Toruń, Poland
Włodzisław Duch
Center for Complex Systems Studies, Kalamazoo College, 49006, Kalamazoo, MI, USA
Péter Érdi
Dipartimento di Informatica e Scienze dell’Informazione, Università di Genova, 16146, Genoa, Italy
Francesco Masulli
Institut für Neuroinformatik, Universität Ulm, 89069, Ulm, Germany
Günther Palm

About this paper

Cite this paper

Elagouni, K., Garcia, C., Mamalet, F., Sébillot, P. (2012). Text Recognition in Videos Using a Recurrent Connectionist Approach. In: Villa, A.E.P., Duch, W., Érdi, P., Masulli, F., Palm, G. (eds) Artificial Neural Networks and Machine Learning – ICANN 2012. ICANN 2012. Lecture Notes in Computer Science, vol 7553. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33266-1_22

DOI: https://doi.org/10.1007/978-3-642-33266-1_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33265-4
Online ISBN: 978-3-642-33266-1
eBook Packages: Computer Science, Computer Science (R0)
