Word-class bigram statistics language model for a hand-written Chinese character recognizer

  • Poster Session II
  • Conference paper
Computer Vision — ACCV'98 (ACCV 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1352)

Abstract

This paper investigates how effectively a word-class bigram statistics language model, supported by a large Chinese lexicon of 87,326 word entries, improves the accuracy of a hand-written Chinese character recognizer. The concept of word-class homogeneity is introduced for classifying words into word-classes. On average, the bigram statistics language model raises the recognition rate by 12.4%, bringing the overall system performance to 89.6%.
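As a rough illustration of how a word-class bigram model can rerank recognizer output, the sketch below runs a Viterbi search over per-position candidates, scoring each transition as P(class_i | class_{i-1}) · P(token_i | class_i) and combining it with the recognizer's own confidence. It is not the authors' implementation: the function name `viterbi_rescore`, the English stand-in tokens, the part-of-speech-like classes, and all probabilities are hypothetical, and the abstract does not specify how word-classes or scores are actually derived.

```python
import math


def viterbi_rescore(candidates, word_class, class_bigram, word_given_class, floor=1e-6):
    """Pick the best token sequence from per-position recognizer candidates.

    candidates: list of positions, each a list of (token, recognizer_prob).
    A path's score is the sum over positions of
        log recognizer_prob + log P(class_i | class_{i-1}) + log P(token_i | class_i).
    Unseen class pairs or tokens fall back to the small `floor` probability.
    """
    def lm_log_prob(prev_tok, tok):
        c_prev = "<s>" if prev_tok is None else word_class.get(prev_tok)
        c = word_class.get(tok)
        p_class = class_bigram.get((c_prev, c), floor)
        p_word = word_given_class.get(tok, floor)
        return math.log(p_class) + math.log(p_word)

    # best[tok] = (score, path) for the best path ending in tok so far
    best = {None: (0.0, [])}
    for position in candidates:
        new_best = {}
        for tok, p_rec in position:
            score, path = max(
                ((s + lm_log_prob(prev, tok), pth) for prev, (s, pth) in best.items()),
                key=lambda item: item[0],
            )
            new_best[tok] = (score + math.log(p_rec), path + [tok])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy data (all values are made up for illustration).
candidates = [
    [("the", 0.6), ("she", 0.4)],
    [("eat", 0.55), ("cat", 0.45)],
]
word_class = {"the": "DET", "she": "PRON", "eat": "VERB", "cat": "NOUN"}
class_bigram = {
    ("<s>", "DET"): 0.5, ("<s>", "PRON"): 0.5,
    ("DET", "NOUN"): 0.8, ("DET", "VERB"): 0.05,
    ("PRON", "VERB"): 0.6, ("PRON", "NOUN"): 0.05,
}
word_given_class = {"the": 0.5, "she": 0.5, "eat": 0.1, "cat": 0.1}

# Recognizer alone: take the top-scoring candidate at each position.
greedy = [max(pos, key=lambda c: c[1])[0] for pos in candidates]
# Recognizer combined with the class-bigram model.
decoded = viterbi_rescore(candidates, word_class, class_bigram, word_given_class)
print(greedy, decoded)
```

In this toy case the recognizer's top choices form an unlikely class sequence, and the class-bigram score is enough to overturn the second choice. Estimating bigrams over classes rather than individual words keeps the parameter table small, which is one common motivation for class-based models when the lexicon is large.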

Editor information

Roland Chin, Ting-Chuen Pong

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wong, PK., Chan, C. (1997). Word-class bigram statistics language model for a hand-written chinese character recognizer. In: Chin, R., Pong, TC. (eds) Computer Vision — ACCV'98. ACCV 1998. Lecture Notes in Computer Science, vol 1352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63931-4_206

  • DOI: https://doi.org/10.1007/3-540-63931-4_206

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63931-2

  • Online ISBN: 978-3-540-69670-4

  • eBook Packages: Springer Book Archive
