A Three Level Cache-Based Adaptive Chinese Language Model

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)

Abstract

Although n-grams have proved to be very powerful and robust in various tasks involving language models, they have a handicap: because of the Markov assumption, the dependency they capture is limited to a very short local context. This article presents an improved cache-based approach to Chinese statistical language modeling. We extend the cache-based model by introducing a Chinese concept lexicon into it. The cache of the extended language model contains not only the words that occurred recently but also words semantically related to them. Experiments show that the performance of the adaptive model is greatly improved.
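The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formulation: the class name `ConceptCacheLM`, the linear interpolation weight `lam`, the discount `related_weight`, and the dictionary-based `concept_lexicon` are all assumptions made for illustration, and the static model is reduced to a unigram table. The sketch only shows how a cache of recent words, expanded with semantically related words, can raise the probability of words the static model alone would treat as unrelated to the recent context.

```python
from collections import Counter, deque


class ConceptCacheLM:
    """Toy cache-based adaptive model (illustrative, not the paper's method)."""

    def __init__(self, static_probs, concept_lexicon, cache_size=200,
                 lam=0.8, related_weight=0.5):
        self.static_probs = static_probs         # word -> static model probability
        self.concept_lexicon = concept_lexicon   # word -> set of related words (assumed format)
        self.history = deque(maxlen=cache_size)  # keeps only the most recent words
        self.lam = lam                           # interpolation weight of the static model
        self.related_weight = related_weight     # partial count given to related words

    def observe(self, word):
        """Add a word from the running text to the cache history."""
        self.history.append(word)

    def _cache_counts(self):
        """Count recent words, plus a discounted count for each related word."""
        counts = Counter()
        for w in self.history:
            counts[w] += 1.0
            for related in self.concept_lexicon.get(w, ()):
                counts[related] += self.related_weight
        return counts

    def prob(self, word):
        """Interpolate the static probability with the cache probability."""
        p_static = self.static_probs.get(word, 0.0)
        counts = self._cache_counts()
        total = sum(counts.values())
        if total == 0:
            return p_static  # empty cache: fall back to the static model
        p_cache = counts[word] / total
        return self.lam * p_static + (1.0 - self.lam) * p_cache


# Hypothetical three-word vocabulary (pinyin for "bank", "loan", "weather").
static_probs = {"yinhang": 0.25, "daikuan": 0.25, "tianqi": 0.5}
concept_lexicon = {"yinhang": {"daikuan"}}

lm = ConceptCacheLM(static_probs, concept_lexicon)
before = lm.prob("daikuan")
lm.observe("yinhang")
after = lm.prob("daikuan")
print(before, after)
```

After "yinhang" is observed, "daikuan" gains probability even though it has not itself occurred, because the lexicon lists it as related; the unrelated word "tianqi" loses probability mass, and the distribution over the vocabulary still sums to one.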

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, J., Sun, L., Qu, W., Du, L., Sun, Y. (2005). A Three Level Cache-Based Adaptive Chinese Language Model. In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science, vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_51

  • DOI: https://doi.org/10.1007/978-3-540-30211-7_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24475-2

  • Online ISBN: 978-3-540-30211-7

  • eBook Packages: Computer Science, Computer Science (R0)
