
Integration of lexical and syntactical knowledge in a handwriting-recognition system

Machine Vision and Applications

Abstract

This article presents a research project that investigated how much the recognition performance of a handwriting-recognition system improves when linguistic information is used. The purpose of the study was to design a postprocessor that enhances an existing handwriting-recognition system by identifying and correcting words it did not initially recognize. This was done by integrating both lexical and syntactical information into the system. Every sentence containing one or more incorrect words is parsed, and all possible grammatical classes for each incorrect word are listed. A lexical enquiry then searches the lexicon for words belonging to those grammatical classes. Finally, a string-comparison algorithm retains only the lexicon words that are close to the incorrect word. The results of the experiments show that such a system is more efficient at correcting words (even highly distorted ones) than conventional systems that integrate only lexical information. In conclusion, integrating linguistic information to correct words not recognized by a handwriting-recognition system is shown to be an effective approach, and one that might be worth pursuing.
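The three correction steps described in the abstract (list candidate grammatical classes, query the lexicon by class, keep only close strings) can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not say which string-comparison algorithm was used, so Levenshtein edit distance is swapped in here as an assumption. The parser is also assumed to have already produced the set of candidate classes. The names `edit_distance`, `correct_word`, `TOY_LEXICON`, and `max_distance` are hypothetical.

```python
from typing import Dict, List, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed stand-in for the string comparison)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct_word(unrecognized: str,
                 candidate_classes: Set[str],
                 lexicon: Dict[str, Set[str]],
                 max_distance: int = 2) -> List[str]:
    """Return lexicon words of an allowed class that are close to the input.

    candidate_classes is assumed to come from a syntactic parse of the
    sentence; producing it is outside the scope of this sketch.
    """
    # Lexical enquiry: keep words sharing at least one allowed class.
    by_class = [w for w, classes in lexicon.items()
                if classes & candidate_classes]
    # String comparison: score each word, keep the close ones, best first.
    scored = sorted((edit_distance(unrecognized, w), w) for w in by_class)
    return [w for d, w in scored if d <= max_distance]


# Hypothetical toy lexicon mapping each word to its grammatical classes.
TOY_LEXICON: Dict[str, Set[str]] = {
    "cat": {"noun"},
    "cut": {"noun", "verb"},
    "eat": {"verb"},
    "car": {"noun"},
    "can": {"noun", "verb"},
}

print(correct_word("cet", {"verb"}, TOY_LEXICON, max_distance=1))  # ['cut']
print(correct_word("cet", {"noun"}, TOY_LEXICON, max_distance=1))  # ['cat', 'cut']
```

In this toy case the syntactic constraint changes the candidate list: with only lexical information, "cat" and "cut" are equally close to "cet", whereas a slot that admits only verbs leaves "cut" alone.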



Author information

Corresponding author

Correspondence to Stéphanie Clergeau-Tournemire.

About this article

Cite this article

Clergeau-Tournemire, S., Plamondon, R. Integration of lexical and syntactical knowledge in a handwriting-recognition system. Machine Vis. Apps. 8, 249–259 (1995). https://doi.org/10.1007/BF01219593

  • DOI: https://doi.org/10.1007/BF01219593
