Abstract
This paper investigates how effectively a word-class bigram language model, supported by a large Chinese lexicon of 87,326 word entries, improves the accuracy of a hand-written Chinese character recognizer. The concept of word-class homogeneity is introduced for classifying words into word-classes. On average, the bigram language model raises the recognition rate by 12.4%, bringing the overall system performance to 89.6%.
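As a rough illustration of the idea in the abstract, the sketch below rescores recognizer candidates with a class-based bigram model. It assumes the common factorization P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), which the abstract does not confirm as the authors' exact formulation. The part-of-speech-like classes, the four-sentence training corpus, the recognizer scores, and the names `train`, `rescore`, and `lm_weight` are all hypothetical. The homogeneity-based word classification and the word segmentation step of the real system are not reproduced here.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # sentence-start pseudo-class (an assumption of this sketch)

def train(sentences, word2class, alpha=1.0):
    """Return add-alpha smoothed log-probability tables (transition, emission)."""
    classes = sorted(set(word2class.values()) | {START})
    trans = defaultdict(Counter)  # trans[prev_class][cls] = count
    emit = defaultdict(Counter)   # emit[cls][word] = count
    for words in sentences:
        prev = START
        for w in words:
            c = word2class[w]
            trans[prev][c] += 1
            emit[c][w] += 1
            prev = c
    log_trans = {}
    for p in classes:
        total = sum(trans[p].values()) + alpha * len(classes)
        for c in classes:
            log_trans[(p, c)] = math.log((trans[p][c] + alpha) / total)
    log_emit = {}
    for w, c in word2class.items():
        members = sum(1 for k in word2class.values() if k == c)
        total = sum(emit[c].values()) + alpha * members
        log_emit[w] = math.log((emit[c][w] + alpha) / total)
    return log_trans, log_emit

def rescore(lattice, word2class, log_trans, log_emit, lm_weight=1.0):
    """Viterbi search over per-slot candidates [(word, recognizer_log_score), ...]."""
    best = {START: (0.0, [])}  # last word -> (best total score, best path)
    for candidates in lattice:
        step = {}
        for word, rec_score in candidates:
            c = word2class[word]
            for prev, (score, path) in best.items():
                pc = START if prev == START else word2class[prev]
                lm = log_trans[(pc, c)] + log_emit[word]
                total = score + rec_score + lm_weight * lm
                if word not in step or total > step[word][0]:
                    step[word] = (total, path + [word])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]

# Toy lexicon with hypothetical word classes.
word2class = {"我": "PRON", "他": "PRON", "是": "VERB", "入": "VERB",
              "人": "NOUN", "门": "NOUN", "好": "ADJ"}
# Toy segmented training corpus.
corpus = [["我", "是", "人"], ["他", "入", "门"],
          ["好", "人", "入", "门"], ["我", "是", "好", "人"]]
log_trans, log_emit = train(corpus, word2class)

# Invented recognizer output: the last slot confuses the look-alikes 入 and 人.
lattice = [[("他", -0.1)], [("是", -0.1)], [("好", -0.2)],
           [("入", -0.6), ("人", -0.9)]]
print(rescore(lattice, word2class, log_trans, log_emit, lm_weight=0.0))
print(rescore(lattice, word2class, log_trans, log_emit))
```

With `lm_weight=0.0` the search follows the recognizer scores alone and keeps 入 in the last slot. With the class bigram switched on, the adjective-to-noun transition outweighs the small recognizer preference and the path ends in 人. A real system would tune `lm_weight` on held-out data and work on character lattices that still need word segmentation.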
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wong, PK., Chan, C. (1997). Word-class bigram statistics language model for a hand-written Chinese character recognizer. In: Chin, R., Pong, TC. (eds) Computer Vision — ACCV'98. ACCV 1998. Lecture Notes in Computer Science, vol 1352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63931-4_206
DOI: https://doi.org/10.1007/3-540-63931-4_206
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63931-2
Online ISBN: 978-3-540-69670-4
eBook Packages: Springer Book Archive