Multimodal Interactive Transcription of Ancient Text Images

  • Conference paper
Multimedia for Cultural Heritage (MM4CH 2011)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 247))

Abstract

The number of digitized legacy documents has risen dramatically in recent years, mainly because a growing number of on-line digital libraries publish this kind of document. On the one hand, the vast majority of these documents still await transcription into an electronic text format (such as ASCII or PDF) that would give historians and other researchers new ways of indexing, consulting and querying them. On the other hand, in some cases adequate transcriptions of the handwritten text images are already available, which creates a growing need to align images and transcriptions so that the documents are easier to consult. This work presents two systems that address these issues. The first transcribes these documents using an interactive-predictive approach, which integrates the user's corrective feedback into the recognition process itself. The second is an alignment method based on the Viterbi algorithm that maps the word images of a given handwritten document to the corresponding (ASCII) words of its transcription.
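
To make the alignment idea concrete, the following is a minimal sketch of a Viterbi-style monotone alignment, not the system described in the paper. It assumes a text line has already been cut into a left-to-right sequence of image frames and that some model supplies a log-score for each frame under each transcript word; the function name `viterbi_align` and the `log_scores` matrix are hypothetical and serve only as illustration. The dynamic program assigns every frame to exactly one word, keeps the words in transcript order, and returns the frame span covered by each word.

```python
import math


def viterbi_align(log_scores):
    """Align frames to transcript words with a monotone Viterbi pass.

    log_scores[t][n] is an assumed log-score of frame t under word n
    (hypothetical input; any frame-level model could supply it).
    Returns one inclusive (start_frame, end_frame) span per word.
    """
    num_frames = len(log_scores)
    num_words = len(log_scores[0])
    if num_frames < num_words:
        raise ValueError("need at least one frame per word")

    neg_inf = -math.inf
    best = [[neg_inf] * num_words for _ in range(num_frames)]
    back = [[0] * num_words for _ in range(num_frames)]
    best[0][0] = log_scores[0][0]  # the first frame must belong to the first word

    for t in range(1, num_frames):
        for n in range(num_words):
            stay = best[t - 1][n]                          # remain in the same word
            move = best[t - 1][n - 1] if n > 0 else neg_inf  # advance to the next word
            if move > stay:
                best[t][n] = move + log_scores[t][n]
                back[t][n] = n - 1
            else:
                best[t][n] = stay + log_scores[t][n]
                back[t][n] = n

    # Backtrack from the last word at the last frame.
    path = [num_words - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()

    # Collapse the frame-level path into one span per word.
    spans = []
    start = 0
    for t in range(1, num_frames + 1):
        if t == num_frames or path[t] != path[t - 1]:
            spans.append((start, t - 1))
            start = t
    return spans


if __name__ == "__main__":
    # Toy scores: frames 0-2 favour word 0, frames 3-4 favour word 1.
    toy_scores = [[0, -5], [0, -5], [0, -5], [-5, 0], [-5, 0]]
    print(viterbi_align(toy_scores))
```

In this toy setting the transcript is fixed and only the word boundaries are searched for, which is what distinguishes alignment from recognition.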

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Romero, V., Sánchez, J.A., Toselli, A.H., Vidal, E. (2012). Multimodal Interactive Transcription of Ancient Text Images. In: Grana, C., Cucchiara, R. (eds) Multimedia for Cultural Heritage. MM4CH 2011. Communications in Computer and Information Science, vol 247. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27978-2_6

  • DOI: https://doi.org/10.1007/978-3-642-27978-2_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-27977-5

  • Online ISBN: 978-3-642-27978-2

  • eBook Packages: Computer Science (R0)
