Integration of lexical and syntactical knowledge in a handwriting-recognition system

Clergeau-Tournemire, Stéphanie; Plamondon, Réjean

doi:10.1007/BF01219593

Published: July 1995

Machine Vision and Applications, volume 8, pages 249–259 (1995)


Abstract

This article presents a research project investigating the improvement in recognition performance that results from using linguistic information in a handwriting-recognition system. The purpose of the study was to design a postprocessor that would enhance an existing handwriting-recognition system by identifying and correcting words it did not recognize initially. This was done by integrating linguistic information (both lexical and syntactical) into the system. Every sentence containing one or more incorrect words is parsed, and all possible grammatical classes for each incorrect word are listed. Then, a lexical enquiry searches the lexicon for words belonging to the grammatical classes of the word in question. Finally, a string-comparison algorithm selects only the words in the lexicon that are close to the incorrect word. The results of these experiments show that such a system is more efficient in correcting words (even highly distorted ones) than conventional systems that integrate only lexical information. In conclusion, integrating linguistic information to correct words not recognized by a handwriting-recognition system is shown to be an effective approach, and one that might be worth pursuing.
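
The three stages described in the abstract (listing grammatical classes, filtering the lexicon by class, then comparing strings) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the parser is assumed to have already produced the set of possible classes for the unrecognized word, the lexicon is a toy dictionary, Levenshtein edit distance stands in for the unspecified string-comparison algorithm, and the names `edit_distance`, `correct_word`, `TOY_LEXICON`, and `max_distance` are all hypothetical.

```python
from typing import Dict, List, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, used here as an assumed stand-in for the
    paper's string-comparison algorithm."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct_word(word: str,
                 classes: Set[str],
                 lexicon: Dict[str, Set[str]],
                 max_distance: int = 2) -> List[str]:
    """Return lexicon entries that share a grammatical class with the
    unrecognized word and lie within max_distance edits of it."""
    candidates = []
    for entry, entry_classes in lexicon.items():
        # Lexical enquiry: keep only entries of an admissible class.
        if entry_classes & classes:
            d = edit_distance(word, entry)
            # String comparison: keep only entries close to the word.
            if d <= max_distance:
                candidates.append((d, entry))
    # Closest candidates first; ties broken alphabetically.
    return [entry for _, entry in sorted(candidates)]


# Hypothetical toy lexicon mapping each word to its grammatical classes.
TOY_LEXICON: Dict[str, Set[str]] = {
    "house": {"noun"},
    "horse": {"noun"},
    "hours": {"noun"},
    "chose": {"verb"},
    "those": {"determiner", "pronoun"},
}

# Classes that a parser is assumed to allow at the word's position.
print(correct_word("hovse", {"noun"}, TOY_LEXICON))
print(correct_word("hovse", {"verb"}, TOY_LEXICON))
```

In this sketch the class filter runs before the string comparison, so a word such as "chose" is never proposed where only a noun is grammatical, even though it is within the distance threshold.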

Author information

Authors and Affiliations

Département de génie électrique et de génie informatique, École Polytechnique de Montréal, Laboratoire Scribens, Succursale Centre-Ville, Case postale 6079, H3C 3A7, Montréal, Québec, Canada
Stéphanie Clergeau-Tournemire & Réjean Plamondon

Corresponding author

Correspondence to Stéphanie Clergeau-Tournemire.

About this article

Cite this article

Clergeau-Tournemire, S., Plamondon, R. Integration of lexical and syntactical knowledge in a handwriting-recognition system. Machine Vis. Apps. 8, 249–259 (1995). https://doi.org/10.1007/BF01219593

Issue Date: July 1995
DOI: https://doi.org/10.1007/BF01219593
