A New Word Clustering Method for Building N-Gram Language Models in Continuous Speech Recognition Systems

  • Conference paper
New Frontiers in Applied Artificial Intelligence (IEA/AIE 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5027)

Abstract

This paper presents a new method for automatic word clustering, which we used to build n-gram language models for Persian continuous speech recognition (CSR) systems. In this method, each word is represented by a feature vector that captures the statistics of the parts of speech (POS) of that word. The feature vectors are clustered with the k-means algorithm. The method reduces the time complexity that is a drawback of other automatic clustering methods, and it alleviates the high perplexity associated with manual clustering methods. The experiments are based on the "Persian Text Corpus", which contains about 9 million words. The extracted language models are evaluated by the perplexity criterion, and the results show a considerable reduction in perplexity. The word error rate of the CSR system is also reduced by about 16% compared with a manual clustering method.
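The clustering idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tagset, the toy corpus, the use of relative POS frequencies as features, and the farthest-point initialization are all assumptions chosen for illustration, and the function names (`pos_feature_vectors`, `kmeans`, `cluster_words`) are hypothetical.

```python
from collections import Counter, defaultdict


def pos_feature_vectors(tagged_corpus, tagset):
    """Map each word to its relative POS-tag frequencies over `tagset`.

    Assumption: "statistics of parts of speech" is read here as the
    fraction of a word's occurrences carrying each tag.
    """
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    vectors = {}
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())
        vectors[word] = [tag_counts[t] / total for t in tagset]
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; returns one cluster index per input point.

    Assumption: centroids are seeded deterministically by farthest-point
    selection; the paper's initialization is not stated in the abstract.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def cluster_words(tagged_corpus, tagset, k):
    """Return a dict mapping each word to a cluster index."""
    vectors = pos_feature_vectors(tagged_corpus, tagset)
    words = sorted(vectors)  # fixed order keeps the result reproducible
    labels = kmeans([vectors[w] for w in words], k)
    return dict(zip(words, labels))


# Toy POS-tagged corpus (hypothetical English data, not the Persian Text Corpus).
tagset = ["N", "V", "ADJ"]
toy_corpus = [
    ("book", "N"), ("book", "N"), ("book", "V"), ("table", "N"),
    ("run", "V"), ("run", "V"), ("run", "N"), ("eat", "V"),
    ("red", "ADJ"), ("big", "ADJ"),
]
clusters = cluster_words(toy_corpus, tagset, k=3)
```

On this toy data, mostly-noun, mostly-verb, and adjective words end up in separate clusters. In a class-based language model, such a word-to-cluster map could then be used to count n-grams over cluster labels rather than over individual words.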



Editor information

Ngoc Thanh Nguyen, Leszek Borzemski, Adam Grzech, Moonis Ali

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bahrani, M., Sameti, H., Hafezi, N., Momtazi, S. (2008). A New Word Clustering Method for Building N-Gram Language Models in Continuous Speech Recognition Systems. In: Nguyen, N.T., Borzemski, L., Grzech, A., Ali, M. (eds) New Frontiers in Applied Artificial Intelligence. IEA/AIE 2008. Lecture Notes in Computer Science, vol 5027. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69052-8_30

  • DOI: https://doi.org/10.1007/978-3-540-69052-8_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69045-0

  • Online ISBN: 978-3-540-69052-8

  • eBook Packages: Computer Science, Computer Science (R0)
