Abstract
The number of digitized legacy documents has risen dramatically in recent years, mainly because a growing number of online digital libraries publish this kind of document. On the one hand, the vast majority of these documents still await transcription into a textual electronic format (such as ASCII or PDF) that would give historians and other researchers new ways of indexing, consulting and querying them. On the other hand, in some cases adequate transcriptions of the handwritten text images are already available. This creates a growing need to align images and transcriptions so that these documents can be consulted more easily. In this work, two systems are presented to deal with these issues. The first aims at transcribing these documents using an interactive-predictive approach, which integrates the user's corrective feedback actions into the recognition process itself. The second presents an alignment method based on the Viterbi algorithm to find mappings between the word images of a given handwritten document and their respective (ASCII) words in its given transcription.
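As a rough illustration of the kind of Viterbi-based alignment mentioned above, the following minimal sketch assigns each column (frame) of a text line image to one word of a known transcription, in reading order, by dynamic programming. It is not the authors' implementation: the function name `viterbi_align` and the cost matrix are hypothetical, and the costs are made-up numbers standing in for whatever scores a real recognizer would supply.

```python
import math


def viterbi_align(cost):
    """Monotone alignment of T frames to W words (illustrative sketch).

    cost[t][w] is a hypothetical penalty for assigning frame t to word w.
    Every word must receive at least one frame and words keep their order.
    Returns one (start, end) frame span per word, with end exclusive.
    """
    T, W = len(cost), len(cost[0])
    best = [[math.inf] * W for _ in range(T)]  # best[t][w]: cheapest path ending at (t, w)
    back = [[0] * W for _ in range(T)]         # back[t][w]: word index used at frame t - 1
    best[0][0] = cost[0][0]                    # the first frame must belong to the first word

    for t in range(1, T):
        for w in range(W):
            stay = best[t - 1][w]                           # remain in the same word
            move = best[t - 1][w - 1] if w > 0 else math.inf  # advance from the previous word
            prev, origin = (stay, w) if stay <= move else (move, w - 1)
            if prev < math.inf:
                best[t][w] = prev + cost[t][w]
                back[t][w] = origin

    if best[T - 1][W - 1] == math.inf:
        raise ValueError("fewer frames than words: no valid alignment")

    # Trace back the best path from the last frame of the last word.
    path = [W - 1]
    w = W - 1
    for t in range(T - 1, 0, -1):
        w = back[t][w]
        path.append(w)
    path.reverse()

    # Collapse the frame-level path into one span per word.
    spans, start = [], 0
    for t in range(1, T + 1):
        if t == T or path[t] != path[t - 1]:
            spans.append((start, t))
            start = t
    return spans


# Toy example: five frames, two words; low cost means a good match.
toy_cost = [[0, 1], [0, 1], [1, 0], [1, 0], [1, 0]]
print(viterbi_align(toy_cost))
```

In this toy case the first two frames are cheapest under the first word and the remaining three under the second, so the alignment places the word boundary between frames 1 and 2.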
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Romero, V., Sánchez, J.A., Toselli, A.H., Vidal, E. (2012). Multimodal Interactive Transcription of Ancient Text Images. In: Grana, C., Cucchiara, R. (eds) Multimedia for Cultural Heritage. MM4CH 2011. Communications in Computer and Information Science, vol 247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27978-2_6
DOI: https://doi.org/10.1007/978-3-642-27978-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27977-5
Online ISBN: 978-3-642-27978-2
eBook Packages: Computer Science, Computer Science (R0)