A Speech Recognizer Based on Multiclass SVMs with HMM-Guided Segmentation

  • Conference paper
Nonlinear Analyses and Algorithms for Speech Processing (NOLISP 2005)

Abstract

Automatic Speech Recognition (ASR) is essentially a pattern classification problem; however, the time dimension of the speech signal has prevented ASR from being posed as a simple static classification problem. Support Vector Machine (SVM) classifiers could provide an appropriate solution, since they are well suited to high-dimensional classification problems. Nevertheless, using SVMs for ASR is by no means straightforward, mainly because SVM classifiers require a fixed-dimension input. In this paper we study the use of an HMM-based segmentation as a means of obtaining the fixed-dimension input vectors required by SVMs, in an isolated-digit recognition task. Different configurations of all the parameters involved have been tested. We also address the problem of multi-class classification (as SVMs are inherently binary classifiers), studying two of the most popular approaches: 1-vs-all and 1-vs-1.
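The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the HMM segmentation is turned into a vector, so averaging the frames inside each segment is an assumption made here, and the function names (`fixed_length_vector`, `one_vs_all_tasks`, `one_vs_one_tasks`) are hypothetical. The segment boundaries are assumed to be supplied by an HMM alignment computed elsewhere.

```python
from itertools import combinations


def fixed_length_vector(frames, boundaries):
    """Average the frames inside each segment and concatenate the means.

    frames: list of equal-length feature vectors, one per time frame.
    boundaries: list of (start, end) frame indices, end exclusive,
        assumed here to come from an HMM alignment (not computed here).
    The output length is len(boundaries) * feature dimension, regardless
    of how many frames the utterance has.
    """
    vector = []
    for start, end in boundaries:
        segment = frames[start:end]
        if not segment:
            raise ValueError(f"empty segment ({start}, {end})")
        dim = len(segment[0])
        # Mean of each feature coefficient over the frames of this segment.
        vector.extend(sum(f[d] for f in segment) / len(segment) for d in range(dim))
    return vector


def one_vs_all_tasks(classes):
    """One binary problem per class: that class against all the others."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def one_vs_one_tasks(classes):
    """One binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))
```

For a ten-digit vocabulary this gives 10 binary problems under 1-vs-all and 45 under 1-vs-1; an utterance of any duration cut into the same number of segments always yields a vector of the same length.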




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Martín-Iglesias, D., Bernal-Chaves, J., Peláez-Moreno, C., Gallardo-Antolín, A., Díaz-de-María, F. (2006). A Speech Recognizer Based on Multiclass SVMs with HMM-Guided Segmentation. In: Faundez-Zanuy, M., Janer, L., Esposito, A., Satue-Villar, A., Roure, J., Espinosa-Duro, V. (eds) Nonlinear Analyses and Algorithms for Speech Processing. NOLISP 2005. Lecture Notes in Computer Science, vol 3817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11613107_22

  • DOI: https://doi.org/10.1007/11613107_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31257-4

  • Online ISBN: 978-3-540-32586-4

  • eBook Packages: Computer Science, Computer Science (R0)
