Competitive Learning Methods for Efficient Vector Quantizations in a Speech Recognition Environment

  • Conference paper
MICAI 2000: Advances in Artificial Intelligence (MICAI 2000)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1793)

Abstract

This paper compares three competitive learning methods for efficient vector quantization of speech data. The algorithms analyzed were two batch methods (the Lloyd LBG algorithm and the Neural Gas method) and one on-line technique (the K-means algorithm). These methods obtain reduced subsets of codewords that represent larger data sets. The experiments covered speaker-dependent and speaker-independent tests and consisted of evaluating the reduced training files for speech recognition purposes. In all the cases studied, the results showed a reduction in the number of learning patterns of nearly two orders of magnitude with respect to the original training sets, without heavily affecting speech recognition accuracy. Given the time savings obtained with these quantization techniques, we consider these reduction results excellent, since they help bring speech matching responses close to real time. The main contribution of this work is an original use of competitive learning techniques for efficient vector quantization of speech data and, consequently, for reducing the memory size and computational cost of a speech recognizer.
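To make the idea of replacing a training set with a small codebook concrete, the following is a minimal sketch of an on-line K-means quantizer in the style of MacQueen's update rule. It is not the authors' implementation: the function names `online_kmeans` and `quantize`, the synthetic two-dimensional data, the random initialisation, and the 1/count learning rate are all assumptions made for illustration, and real speech features would have many more dimensions.

```python
import random


def nearest(x, codebook):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(x, codebook[j])),
    )


def online_kmeans(data, k, seed=0):
    """On-line K-means: one pass over the data, one update per pattern."""
    rng = random.Random(seed)
    # Assumption: initialise the codebook with k distinct training vectors.
    codebook = [list(v) for v in rng.sample(data, k)]
    counts = [1] * k  # each initial codeword counts as one pattern
    order = list(range(len(data)))
    rng.shuffle(order)
    for i in order:
        x = data[i]
        w = nearest(x, codebook)  # winner-take-all competition
        counts[w] += 1
        rate = 1.0 / counts[w]  # running-mean learning rate
        # Move only the winning codeword towards the presented pattern.
        codebook[w] = [c + rate * (a - c) for a, c in zip(x, codebook[w])]
    return codebook


def quantize(x, codebook):
    """Replace a vector by the index of its nearest codeword."""
    return nearest(x, codebook)


# Hypothetical data: 200 synthetic 2-D vectors reduced to 2 codewords.
gen = random.Random(1)
data = [(gen.gauss(0.0, 1.0), gen.gauss(0.0, 1.0)) for _ in range(100)]
data += [(gen.gauss(10.0, 1.0), gen.gauss(10.0, 1.0)) for _ in range(100)]
codebook = online_kmeans(data, k=2)
labels = [quantize(x, codebook) for x in data]
```

Here 200 training vectors are represented by 2 codewords, a reduction of two orders of magnitude in the number of stored patterns. A recognizer that matches against the codebook rather than the full training set would perform correspondingly fewer distance computations, which is the kind of saving the abstract describes.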

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Curatelli, F., Mayora-Ibarra, O. (2000). Competitive Learning Methods for Efficient Vector Quantizations in a Speech Recognition Environment. In: Cairó, O., Sucar, L.E., Cantu, F.J. (eds) MICAI 2000: Advances in Artificial Intelligence. MICAI 2000. Lecture Notes in Computer Science, vol 1793. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10720076_10

  • DOI: https://doi.org/10.1007/10720076_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67354-5

  • Online ISBN: 978-3-540-45562-2

  • eBook Packages: Springer Book Archive
