Semi-automatic Training Sets Acquisition for Handwriting Recognition

Sas, Jerzy; Markowska-Kaczmar, Urszula

doi:10.1007/978-3-540-74272-2_66

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4673)

Included in the following conference series:

International Conference on Computer Analysis of Images and Patterns

Abstract

In this paper, a method of semi-automatic training set acquisition for character classifiers used in cursive handwriting recognition is described. The training set consists of character samples extracted from a training corpus by segmentation. The method first splits each word image from the corpus into a sequence of graphemes. Then, a set of candidate segmentation variants is elicited with an evolutionary algorithm, where each segmentation variant determines how the grapheme sequences of words are subdivided into subsequences corresponding to consecutive letters. Segmentation variants are modeled by a population of chromosomes. Next, each segmentation variant from the final population is tuned in an iterative process and the best chromosome is selected. The character samples obtained by applying the segmentation modeled by the selected chromosome are then grouped into sets corresponding to the letters of the alphabet. Finally, the most outlying samples are rejected so as to maximize the word recognition accuracy obtained with a character classifier trained on the reduced sample set.
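
The abstract does not give the chromosome encoding, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a chromosome stores, for every word in the corpus, a list of positive integers giving the number of consecutive graphemes assigned to each letter. The names `random_segmentation`, `mutate` and `extract_samples`, the boundary-shift mutation, and the string placeholders used for graphemes are all hypothetical; no fitness function is shown because the abstract does not describe one.

```python
import random
from collections import defaultdict


def random_segmentation(n_graphemes, n_letters, rng):
    """Split n_graphemes into n_letters non-empty consecutive runs (assumed encoding)."""
    if n_letters > n_graphemes:
        raise ValueError("each letter needs at least one grapheme")
    # Pick n_letters - 1 distinct cut points between neighbouring graphemes.
    cuts = sorted(rng.sample(range(1, n_graphemes), n_letters - 1))
    bounds = [0] + cuts + [n_graphemes]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def mutate(lengths, rng):
    """Hypothetical mutation: shift one grapheme across a random letter boundary."""
    lengths = list(lengths)
    if len(lengths) < 2:
        return lengths
    i = rng.randrange(len(lengths) - 1)
    donor, receiver = (i, i + 1) if rng.random() < 0.5 else (i + 1, i)
    if lengths[donor] > 1:  # never leave a letter without graphemes
        lengths[donor] -= 1
        lengths[receiver] += 1
    return lengths


def extract_samples(corpus, chromosome):
    """Group the grapheme subsequences of all words by the letter they represent."""
    samples = defaultdict(list)
    for (word, graphemes), lengths in zip(corpus, chromosome):
        start = 0
        for letter, length in zip(word, lengths):
            samples[letter].append(graphemes[start:start + length])
            start += length
    return dict(samples)


# Toy corpus: each entry pairs a transcription with its grapheme sequence.
corpus = [("cat", ["g0", "g1", "g2", "g3", "g4"]), ("at", ["h0", "h1", "h2"])]
rng = random.Random(0)
chromosome = [random_segmentation(len(g), len(w), rng) for w, g in corpus]
chromosome = [mutate(lengths, rng) for lengths in chromosome]
samples = extract_samples(corpus, chromosome)
```

With this encoding every chromosome is a valid segmentation by construction: the run lengths for a word always sum to its grapheme count, so grouping samples by letter never needs a repair step.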

Author information

Authors and Affiliations

Wroclaw University of Technology, Applied Informatics Institute, Wyb Wyspianskiego 27, Wroclaw, Poland
Jerzy Sas & Urszula Markowska-Kaczmar

Editor information

Walter G. Kropatsch, Martin Kampel, Allan Hanbury

About this paper

Cite this paper

Sas, J., Markowska-Kaczmar, U. (2007). Semi-automatic Training Sets Acquisition for Handwriting Recognition. In: Kropatsch, W.G., Kampel, M., Hanbury, A. (eds) Computer Analysis of Images and Patterns. CAIP 2007. Lecture Notes in Computer Science, vol 4673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74272-2_66

DOI: https://doi.org/10.1007/978-3-540-74272-2_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74271-5
Online ISBN: 978-3-540-74272-2
eBook Packages: Computer Science, Computer Science (R0)