Abstract
About 3.5 million dried plant specimens mounted on paper sheets are held in the Botanical Museum Berlin, Germany. Many of them carry handwritten annotations (see Figure 1), so a procedure was needed to process the handwriting on the sheets. The approach presented here identifies the writer from handwritten words and reads handwritten keywords. Each word is cut out, transformed into a 6-dimensional time series, and compared, e.g. by means of dynamic time warping (DTW). A recognition rate of 98.6% is achieved with 12 different words (1200 samples). All herbarium sheets also contain several printed tokens that point to further information about the plant: who found it, where it was found (country and sometimes the town), what kind of plant it is, and so on. By exploiting the spatial relationship between these tokens and the surrounding text, more information can be extracted from a herbarium sheet, e.g. by locating and recognizing handwritten text in a defined area.
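The word comparison described in the abstract can be illustrated with a minimal sketch of dynamic time warping over multi-dimensional sequences. The abstract does not say which six features are extracted or which DTW variant and step pattern are used, so everything below is an assumption made for illustration: the Euclidean local cost, the three standard steps, the nearest-neighbour decision, and the names `dtw_distance` and `nearest_label` are hypothetical and not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

# One frame is a feature vector, e.g. 6 values per image column.
# Which 6 features are used is NOT given in the abstract (assumption).
Frame = Sequence[float]


def dtw_distance(a: Sequence[Frame], b: Sequence[Frame]) -> float:
    """Textbook DTW: Euclidean local cost, steps (1,0), (0,1), (1,1)."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    inf = math.inf
    # prev[j] holds the accumulated cost of aligning a[:i-1] with b[:j].
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for i in range(1, len(a) + 1):
        curr = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


def nearest_label(
    query: Sequence[Frame],
    references: List[Tuple[str, Sequence[Frame]]],
) -> str:
    """Return the label of the reference series closest to the query."""
    return min(references, key=lambda item: dtw_distance(query, item[1]))[0]


if __name__ == "__main__":
    # Toy data only: two short 6-dimensional series with made-up labels.
    ref_a = [(0.0,) * 6, (1.0,) * 6]
    ref_b = [(5.0,) * 6, (5.0,) * 6]
    query = [(0.0,) * 6, (0.0,) * 6, (1.0,) * 6]  # ref_a stretched in time
    print(dtw_distance(query, ref_a))
    print(nearest_label(query, [("writer A", ref_a), ("writer B", ref_b)]))
```

DTW is a natural fit here because two renderings of the same word rarely have the same width; the warping path absorbs local stretching and compression before the distance is accumulated. Only two rows of the cost matrix are kept in memory, since each cell depends on its left, upper, and upper-left neighbours.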
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mund, B., Steinke, K.-H. (2010). Processing Handwritten Words by Intelligent Use of OCR Results. In: Perner, P. (ed.) Advances in Data Mining. Applications and Theoretical Aspects. ICDM 2010. Lecture Notes in Computer Science, vol 6171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14400-4_14
DOI: https://doi.org/10.1007/978-3-642-14400-4_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14399-1
Online ISBN: 978-3-642-14400-4
eBook Packages: Computer Science, Computer Science (R0)