Semi-automatic Training Sets Acquisition for Handwriting Recognition

  • Conference paper
Computer Analysis of Images and Patterns (CAIP 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4673)

Abstract

This paper describes a method of semi-automatic training set acquisition for character classifiers used in cursive handwriting recognition. The training set consists of character samples extracted from a training corpus by segmentation. The method first splits each word image from the corpus into a sequence of graphemes. A set of candidate segmentation variants is then produced by an evolutionary algorithm; each segmentation variant determines how the grapheme sequences of words are subdivided into subsequences corresponding to consecutive letters. Segmentation variants are modeled by a population of chromosomes. Next, each segmentation variant from the final population is tuned in an iterative process, and the best chromosome is selected. The character samples obtained by applying the segmentation modeled by the selected chromosome are grouped into sets corresponding to the letters of the alphabet. Finally, the most outlying samples are rejected so as to maximize the word recognition accuracy obtained with a character classifier trained on the reduced sample set.
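
The segmentation variant described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding (one positive grapheme count per letter), the boundary-shifting mutation, the truncation selection, and all function names are assumptions made for illustration. It also handles a single word, whereas the chromosome in the paper covers the words of the whole corpus. The fitness function is supplied by the caller; the abstract does not specify one, so the usage lines use a toy stand-in.

```python
import random


def random_segmentation(num_graphemes, num_letters, rng):
    """Return num_letters positive grapheme counts that sum to num_graphemes."""
    if not 1 <= num_letters <= num_graphemes:
        raise ValueError("need at least one grapheme per letter")
    # Choose num_letters - 1 distinct cut positions between graphemes.
    cuts = sorted(rng.sample(range(1, num_graphemes), num_letters - 1))
    bounds = [0] + cuts + [num_graphemes]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def split_graphemes(graphemes, counts):
    """Cut a grapheme sequence into consecutive per-letter subsequences."""
    parts, start = [], 0
    for count in counts:
        parts.append(graphemes[start:start + count])
        start += count
    return parts


def mutate(counts, rng):
    """Move one grapheme across a letter boundary, keeping every letter non-empty."""
    counts = list(counts)
    # (donor, receiver) pairs of adjacent letters where the donor can spare one.
    moves = [(i, j)
             for i in range(len(counts))
             for j in (i - 1, i + 1)
             if 0 <= j < len(counts) and counts[i] > 1]
    if moves:
        donor, receiver = rng.choice(moves)
        counts[donor] -= 1
        counts[receiver] += 1
    return counts


def evolve(num_graphemes, num_letters, fitness,
           generations=50, pop_size=20, seed=0):
    """Toy evolutionary loop: keep the fitter half, refill with mutants."""
    rng = random.Random(seed)
    population = [random_segmentation(num_graphemes, num_letters, rng)
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[:pop_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=fitness)


# Usage: six hypothetical graphemes for a three-letter word. The toy fitness
# (prefer evenly sized letters) is an assumption, not the paper's criterion.
graphemes = ["g0", "g1", "g2", "g3", "g4", "g5"]
best = evolve(len(graphemes), 3, fitness=lambda counts: -max(counts))
print(split_graphemes(graphemes, best))
```

Encoding a variant as per-letter counts keeps every candidate valid by construction: each letter receives at least one grapheme and the subsequences stay consecutive, so the mutation never has to repair a chromosome.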

Editor information

Walter G. Kropatsch, Martin Kampel, Allan Hanbury

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sas, J., Markowska-Kaczmar, U. (2007). Semi-automatic Training Sets Acquisition for Handwriting Recognition. In: Kropatsch, W.G., Kampel, M., Hanbury, A. (eds) Computer Analysis of Images and Patterns. CAIP 2007. Lecture Notes in Computer Science, vol 4673. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74272-2_66

  • DOI: https://doi.org/10.1007/978-3-540-74272-2_66

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74271-5

  • Online ISBN: 978-3-540-74272-2

  • eBook Packages: Computer Science, Computer Science (R0)
