Word-class bigram statistics language model for a hand-written Chinese character recognizer

Wong, Pak-Kwong; Chan, Chorkin

doi:10.1007/3-540-63931-4_206


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1352)

Included in the following conference series:

Asian Conference on Computer Vision


Abstract

This paper investigates how effectively a word-class bigram statistics language model, supported by a large Chinese lexicon of 87,326 word entries, improves the accuracy of a hand-written Chinese character recognizer. The concept of word-class homogeneity is introduced and used to classify words into word-classes. On average, the bigram statistics language model raises the recognition rate by 12.4%, bringing the overall system performance to 89.6%.
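As a rough illustration of the general idea, the sketch below rescores recognizer candidates with the textbook class-based bigram approximation, P(w | w_prev) ≈ P(class(w) | class(w_prev)) · P(w | class(w)), using a Viterbi search. This is not the authors' implementation: the abstract does not give the model's exact form, and every name, class label, probability, and candidate list here is a hypothetical toy value, with English words standing in for Chinese words.

```python
import math

# Hypothetical toy model; none of these values come from the paper.
WORD_CLASS = {"we": "PRON", "read": "VERB", "road": "NOUN", "books": "NOUN"}
CLASS_BIGRAM = {  # P(current class | previous class)
    ("PRON", "VERB"): 0.6,
    ("PRON", "NOUN"): 0.1,
    ("VERB", "NOUN"): 0.7,
    ("NOUN", "NOUN"): 0.2,
}
WORD_GIVEN_CLASS = {"we": 0.5, "read": 0.3, "road": 0.3, "books": 0.3}
FLOOR = 1e-6  # assumed back-off probability for unseen events


def class_bigram_logprob(prev_word, word):
    """log P(word | prev_word) under the class-based bigram approximation."""
    pair = (WORD_CLASS[prev_word], WORD_CLASS[word])
    p_class = CLASS_BIGRAM.get(pair, FLOOR)
    p_word = WORD_GIVEN_CLASS.get(word, FLOOR)
    return math.log(p_class) + math.log(p_word)


def rescore(candidates, lm_weight=1.0):
    """Viterbi search over per-position lists of (word, recognizer log score)."""
    # best[w] = (total log score, path) of the best path ending in word w
    best = {w: (s, [w]) for w, s in candidates[0]}
    for position in candidates[1:]:
        new_best = {}
        for w, s in position:
            scored = [
                (prev + s + lm_weight * class_bigram_logprob(pw, w), path + [w])
                for pw, (prev, path) in best.items()
            ]
            new_best[w] = max(scored, key=lambda t: t[0])
        best = new_best
    return max(best.values(), key=lambda t: t[0])[1]


# Hypothetical recognizer output: the shape-based top choice at position 2
# ("road") is wrong, but the language model prefers PRON -> VERB -> NOUN.
candidates = [
    [("we", math.log(0.9))],
    [("road", math.log(0.6)), ("read", math.log(0.4))],
    [("books", math.log(0.9))],
]
recognizer_only = [max(pos, key=lambda t: t[1])[0] for pos in candidates]
decoded = rescore(candidates)
print(recognizer_only, decoded)
```

Because probabilities are estimated per class pair rather than per word pair, a model of this kind needs far fewer parameters than a word bigram over a lexicon of tens of thousands of entries, which is the usual motivation for grouping words into classes.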



Author information

Authors and Affiliations

Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong
Pak-Kwong Wong & Chorkin Chan


Editor information

Roland Chin, Ting-Chuen Pong


About this paper

Cite this paper

Wong, PK., Chan, C. (1997). Word-class bigram statistics language model for a hand-written Chinese character recognizer. In: Chin, R., Pong, TC. (eds) Computer Vision — ACCV'98. ACCV 1998. Lecture Notes in Computer Science, vol 1352. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63931-4_206


DOI: https://doi.org/10.1007/3-540-63931-4_206
Published: 29 July 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63931-2
Online ISBN: 978-3-540-69670-4
eBook Packages: Springer Book Archive
