Multimodal Interactive Transcription of Ancient Text Images

Romero, Verónica; Sánchez, Joan Andreu; Toselli, Alejandro H.; Vidal, Enrique

doi:10.1007/978-3-642-27978-2_6

Part of the book series: Communications in Computer and Information Science (CCIS, volume 247)

Included in the following conference series:

International Workshop on Multimedia for Cultural Heritage

Abstract

The number of digitized legacy documents has risen dramatically in recent years, mainly because a growing number of online digital libraries publish this kind of document. On the one hand, the vast majority of these documents are still waiting to be transcribed into an electronic text format (such as ASCII or PDF) that would give historians and other researchers new ways of indexing, consulting and querying them. On the other hand, in some cases adequate transcriptions of the handwritten text images are already available, which creates a growing need to align images and transcriptions so that the documents are easier to consult. This work presents two systems that address these issues. The first transcribes the documents using an interactive-predictive approach that integrates the user's corrective feedback into the recognition process itself. The second is an alignment method based on the Viterbi algorithm that maps the word images of a given handwritten document to the corresponding (ASCII) words in its transcription.
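
To make the alignment idea concrete, the following is a minimal sketch of a Viterbi-style forced alignment, not the authors' implementation. It assumes a hypothetical score table log_lik[t][n] that gives the log-likelihood of image frame t (for example, one pixel column of a text line) under the model of the n-th transcript word. In a real system such scores would come from trained models such as HMMs, whereas here they are hand-written toy values. The function name viterbi_align and the one-state-per-word simplification are likewise illustrative assumptions.

```python
from typing import List, Tuple


def viterbi_align(log_lik: List[List[float]]) -> List[Tuple[int, int]]:
    """Align T frames to N words in reading order.

    log_lik[t][n] is a (hypothetical) log-likelihood of frame t under word n.
    Returns one (start, end) frame span per word, with end exclusive.
    Every word receives at least one frame and words never reorder.
    """
    T = len(log_lik)
    N = len(log_lik[0]) if T else 0
    if N == 0 or T < N:
        raise ValueError("need at least one frame per word")

    NEG = float("-inf")
    # best[t][n]: score of the best path that is in word n at frame t.
    best = [[NEG] * N for _ in range(T)]
    # entered[t][n]: True if that best path moved into word n exactly at frame t.
    entered = [[False] * N for _ in range(T)]
    best[0][0] = log_lik[0][0]  # the path must start in the first word

    for t in range(1, T):
        for n in range(N):
            stay = best[t - 1][n]                       # remain in the same word
            move = best[t - 1][n - 1] if n > 0 else NEG  # advance from previous word
            if move > stay:
                best[t][n] = move + log_lik[t][n]
                entered[t][n] = True
            elif stay > NEG:
                best[t][n] = stay + log_lik[t][n]

    # Backtrack from the last word at the last frame, collecting word boundaries.
    spans: List[Tuple[int, int]] = []
    n, end = N - 1, T
    for t in range(T - 1, -1, -1):
        if entered[t][n]:
            spans.append((t, end))
            end = t
            n -= 1
    spans.append((0, end))  # the first word always starts at frame 0
    spans.reverse()
    return spans


# Toy example: 6 frames, 3 words; each frame clearly favours one word.
toy_scores = [
    [-1.0, -5.0, -5.0],
    [-1.0, -5.0, -5.0],
    [-5.0, -1.0, -5.0],
    [-5.0, -1.0, -5.0],
    [-5.0, -1.0, -5.0],
    [-5.0, -5.0, -1.0],
]
print(viterbi_align(toy_scores))  # [(0, 2), (2, 5), (5, 6)]
```

Each returned span marks where one transcript word would sit in the line image, which is the kind of word-image-to-word mapping the abstract describes. A fuller treatment would likely use several states per character and handle inter-word whitespace, both of which this sketch leaves out.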

Author information

Authors and Affiliations

Instituto Tecnológico de Informática (ITI), Universidad Politécnica de Valencia, Spain
Verónica Romero, Joan Andreu Sánchez, Alejandro H. Toselli & Enrique Vidal

Editor information

Editors and Affiliations

Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Modena e Reggio Emilia, Via Vignolese 905/b, 41125, Modena, Italy
Costantino Grana & Rita Cucchiara

About this paper

Cite this paper

Romero, V., Sánchez, J.A., Toselli, A.H., Vidal, E. (2012). Multimodal Interactive Transcription of Ancient Text Images. In: Grana, C., Cucchiara, R. (eds) Multimedia for Cultural Heritage. MM4CH 2011. Communications in Computer and Information Science, vol 247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27978-2_6

DOI: https://doi.org/10.1007/978-3-642-27978-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27977-5
Online ISBN: 978-3-642-27978-2
eBook Packages: Computer Science (R0)
