Abstract
One of the basic problems in speaker verification applications is the presence of environmental noise. State-of-the-art speaker verification models based on the Support Vector Machine (SVM) are significantly vulnerable to high noise levels. This paper presents an SVM/GMM (Gaussian Mixture Model) classifier for text-independent speaker verification that shows additional robustness. Two techniques for training the GMM models are applied, and they give different results depending on the level of environmental noise. The recognition phase was tested with Serbian speakers at different signal-to-noise ratios (SNRs).
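Testing at a chosen SNR usually means scaling a noise recording so that the ratio of speech power to noise power hits a target value before the two are added. The abstract does not say how the noisy test material was produced, so the following is only a minimal sketch of that general idea, not the authors' procedure. The function names (`power`, `mix_at_snr`, `measured_snr_db`), the synthetic sine "speech", the Gaussian noise, and the 8 kHz sample rate are all illustrative assumptions.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to the target SNR relative to `speech`, then add it.

    Returns (noisy_signal, scaled_noise). Hypothetical helper, not taken
    from the paper.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = power(speech)
    p_noise = power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (g^2 * P_noise)); solve for the gain g.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


def measured_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise that was added to it."""
    return 10 * math.log10(power(speech) / power(noise))


# Toy data (assumption): one second of a 220 Hz tone at 8 kHz plus white noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

noisy, scaled_noise = mix_at_snr(speech, noise, 10.0)
print(round(measured_snr_db(speech, scaled_noise), 6))
```

Repeating the call with several values of `target_snr_db` would give one noisy copy of each test utterance per noise level, which is one plausible way to build an evaluation set of the kind the abstract describes.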
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Cirovic, Z., Cirovic, N. (2014). A Robust SVM/GMM Classifier for Speaker Verification. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_9
DOI: https://doi.org/10.1007/978-3-319-11581-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11580-1
Online ISBN: 978-3-319-11581-8
eBook Packages: Computer Science (R0)