Processing Handwritten Words by Intelligent Use of OCR Results

  • Conference paper
Advances in Data Mining. Applications and Theoretical Aspects (ICDM 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6171)

Abstract

About 3.5 million dried plants mounted on paper sheets are held in the Botanical Museum Berlin in Germany. Many of these sheets carry handwritten annotations (see figure 1), so a procedure was needed to process the handwriting on them. The approach presented here tries to identify the writer from handwritten words and to read handwritten keywords. To do this, each word is cut out of the sheet, transformed into a 6-dimensional time series, and compared with reference words, e.g. by means of the dynamic time warping (DTW) method. A recognition rate of 98.6% is achieved with 12 different words (1200 samples). All herbarium documents also contain several printed tokens that give further information about the plant: who found it, where it was found (the country and sometimes the town), what kind of plant it is, and so on. By exploiting the spatial relationships between these text elements, more information can be extracted from the herbarium document, e.g. by finding and recognizing handwritten text in a defined area.
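The word comparison described in the abstract can be illustrated with a minimal sketch of textbook DTW over sequences of 6-dimensional feature vectors. The abstract does not say which six features are extracted, which local distance is used, or how the final decision is made, so the Euclidean local cost, the nearest-neighbour rule, and the names `dtw_distance` and `classify` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Each element of `a` and `b` is one time step, e.g. a 6-tuple of
    features (the concrete features are not given in the abstract).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: Euclidean distance between the vectors.
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


def classify(query, references):
    """Assumed decision rule: label of the DTW-nearest reference series.

    `references` is a list of (label, series) pairs.
    """
    return min(references, key=lambda item: dtw_distance(query, item[1]))[0]


if __name__ == "__main__":
    # Hypothetical toy data: two short 6-dimensional series per label.
    slow = [(0.0,) * 6, (0.0,) * 6, (1.0,) * 6]
    other = [(5.0,) * 6, (5.0,) * 6]
    query = [(0.0,) * 6, (1.0,) * 6]
    print(classify(query, [("writer_A", slow), ("writer_B", other)]))
```

Because DTW allows one time step to be matched against several steps of the other series, the same word written at a different width can still align with low cost, which is the usual motivation for using it on handwriting profiles.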

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mund, B., Steinke, K.-H. (2010). Processing Handwritten Words by Intelligent Use of OCR Results. In: Perner, P. (eds) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2010. Lecture Notes in Computer Science, vol 6171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14400-4_14

  • DOI: https://doi.org/10.1007/978-3-642-14400-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14399-1

  • Online ISBN: 978-3-642-14400-4

  • eBook Packages: Computer Science, Computer Science (R0)
