Abstract
Automatic Speech Recognition (ASR) is essentially a pattern classification problem; however, the time dimension of the speech signal has prevented ASR from being posed as a simple static classification problem. Support Vector Machine (SVM) classifiers could provide an appropriate solution, since they are well suited to high-dimensional classification problems. Nevertheless, using SVMs for ASR is by no means straightforward, mainly because SVM classifiers require a fixed-dimension input. In this paper we study the use of an HMM-based segmentation as a means of obtaining the fixed-dimension input vectors required by SVMs, in an isolated-digit recognition task. Different configurations of all the parameters involved have been tested. We also address the problem of multi-class classification (as SVMs were originally formulated as binary classifiers), studying two of the most popular approaches: 1-vs-all and 1-vs-1.
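The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the frames inside a segment are combined, so averaging them is an assumption made here. The segment boundaries would come from an HMM alignment, but in this sketch they are simply passed in as a list. The names `segment_means`, `one_vs_all` and `one_vs_one` are hypothetical, and the binary classifier outputs are given as plain numbers rather than produced by trained SVMs.

```python
from itertools import combinations


def segment_means(frames, boundaries):
    """Map a variable-length utterance to a fixed-dimension vector.

    frames: list of equal-length feature vectors, one per frame.
    boundaries: exclusive end index of each segment (assumed to come from
    an HMM alignment; every segment is assumed to be non-empty).
    The frames of each segment are averaged (an assumption) and the
    per-segment means are concatenated.
    """
    vector, start = [], 0
    for end in boundaries:
        segment = frames[start:end]
        dim = len(segment[0])
        vector.extend(sum(f[d] for f in segment) / len(segment) for d in range(dim))
        start = end
    return vector


def one_vs_all(scores):
    """scores[c] is the output of the 'class c vs rest' classifier.

    The class whose classifier gives the largest output wins.
    """
    return max(scores, key=scores.get)


def one_vs_one(pair_scores, classes):
    """pair_scores[(a, b)] > 0 is a vote for a, otherwise a vote for b.

    The class with the most votes wins; ties go to the class listed first.
    """
    votes = {c: 0 for c in classes}
    for (a, b), score in pair_scores.items():
        votes[a if score > 0 else b] += 1
    return max(classes, key=lambda c: votes[c])


# Two utterances of different lengths, both split into 3 segments.
utt_short = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
utt_long = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
vec_short = segment_means(utt_short, [1, 2, 3])
vec_long = segment_means(utt_long, [2, 4, 5])

# Number of binary classifiers needed for the 10 digits.
digits = list(range(10))
n_one_vs_all = len(digits)
n_one_vs_one = len(list(combinations(digits, 2)))
```

Both utterances end up as vectors of 3 segments × 2 features = 6 components, whatever their original number of frames. For ten digit classes, 1-vs-all needs 10 binary classifiers and 1-vs-1 needs 45, one per pair of classes.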
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martín-Iglesias, D., Bernal-Chaves, J., Peláez-Moreno, C., Gallardo-Antolín, A., Díaz-de-María, F. (2006). A Speech Recognizer Based on Multiclass SVMs with HMM-Guided Segmentation. In: Faundez-Zanuy, M., Janer, L., Esposito, A., Satue-Villar, A., Roure, J., Espinosa-Duro, V. (eds) Nonlinear Analyses and Algorithms for Speech Processing. NOLISP 2005. Lecture Notes in Computer Science, vol 3817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11613107_22
DOI: https://doi.org/10.1007/11613107_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31257-4
Online ISBN: 978-3-540-32586-4
eBook Packages: Computer Science, Computer Science (R0)