Lexicon-driven HMM decoding for large vocabulary handwriting recognition with multiple character models

  • OriginalPaper
Document Analysis and Recognition

Abstract.

This paper presents a handwriting recognition system that deals with unconstrained handwriting and large vocabularies. The system is based on the segmentation-recognition paradigm: words are first loosely segmented into characters or pseudocharacters, and the final segmentation is obtained during the recognition process, which is carried out with a lexicon. Characters are modeled by multiple hidden Markov models (HMMs), which are concatenated to build up word models. The lexicon is organized as a tree structure, so that during decoding, words with a common prefix share the same computation steps. To avoid an explosion of the search space due to the presence of multiple character models, a lexicon-driven level building algorithm (LDLBA) is used to decode the lexical tree and to choose the more likely models at each level. Bigram probabilities related to the variation of writing styles within words are inserted between the levels of the LDLBA to improve recognition accuracy. To further speed up recognition, constraints are added to limit the search effort to the more likely parts of the search space. Experimental results on a dataset of 4674 unconstrained words show that the proposed recognition system achieves recognition rates from 98% for a 10-word vocabulary to 71% for a 30,000-word vocabulary, with recognition times from 9 ms to 18.4 s, respectively.
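The prefix sharing that the abstract attributes to the tree-structured lexicon can be illustrated with a minimal sketch. This is not the paper's LDLBA: the names `TrieNode`, `build_lexical_tree`, and `count_edges`, as well as the four-word lexicon, are hypothetical, and counting one unit of work per character edge is a simplifying assumption standing in for the evaluation of a character HMM.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrieNode:
    """One node of a lexical tree; each outgoing edge is labelled with a character."""
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    word: Optional[str] = None  # set when a lexicon entry ends at this node


def build_lexical_tree(lexicon: List[str]) -> TrieNode:
    """Insert every word so that words with a common prefix share a path."""
    root = TrieNode()
    for word in lexicon:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.word = word
    return root


def count_edges(node: TrieNode) -> int:
    """Number of character edges in the tree (assumed proxy for decoding work)."""
    return sum(1 + count_edges(child) for child in node.children.values())


# Hypothetical toy lexicon, chosen only to show shared prefixes.
lexicon = ["paris", "parma", "part", "port"]
tree = build_lexical_tree(lexicon)

flat_cost = sum(len(w) for w in lexicon)  # each word decoded independently
tree_cost = count_edges(tree)             # shared prefixes decoded once

print(flat_cost, tree_cost)
```

In this toy case the flat lexicon needs 18 character evaluations while the tree needs 11, because the prefixes "p" and "par" are evaluated once. In a real decoder each edge would carry one or more character HMMs scored against the observation sequence, so the saving applies to much more expensive operations than a single count.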

Author information

Corresponding author

Correspondence to Alessandro L. Koerich.

Additional information

Received: 8 July 2002, Accepted: 1 July 2003, Published online: 12 September 2003

About this article

Cite this article

Koerich, A.L., Sabourin, R. & Suen, C.Y. Lexicon-driven HMM decoding for large vocabulary handwriting recognition with multiple character models. IJDAR 6, 126–144 (2003). https://doi.org/10.1007/s10032-003-0113-0

  • DOI: https://doi.org/10.1007/s10032-003-0113-0
