A Robust SVM/GMM Classifier for Speaker Verification

Cirovic, Zoran; Cirovic, Natasa

doi:10.1007/978-3-319-11581-8_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8773)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

A basic problem in speaker verification applications is the presence of environmental noise. State-of-the-art speaker verification models based on the Support Vector Machine (SVM) are markedly vulnerable to high noise levels. This paper presents an SVM/GMM classifier for text-independent speaker verification that offers additional robustness. Two techniques for training the Gaussian mixture models (GMMs) are applied, and their results differ depending on the level of environmental noise. The recognition phase was tested with Serbian speakers at different Signal-to-Noise Ratios (SNR).
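The abstract does not say how the noisy test conditions were produced. As a rough illustration of what testing at a given SNR involves, the sketch below scales a noise signal so that a speech-plus-noise mixture reaches a requested SNR in dB. The function names (`mix_at_snr`, `measured_snr_db`), the synthetic sine "speech" and the Gaussian noise are assumptions made for this example only; they are not taken from the paper.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR (dB).

    Hypothetical helper for illustration; not the authors' procedure.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = power(speech)
    p_noise = power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (g^2 * P_noise)); solve for the gain g.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR of a mixture when the clean speech is known."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(power(speech) / power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for real recordings: a 440 Hz tone and white Gaussian noise.
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for target in (20, 10, 0):
        noisy = mix_at_snr(speech, noise, target)
        print(target, round(measured_snr_db(speech, noisy), 3))
```

In an evaluation of this kind, each test utterance would be mixed at several target SNR values and the verification error measured separately per condition.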

Author information

Authors and Affiliations

School of Electrical and Computer Engineering of Applied Studies, Belgrade, Serbia
Zoran Cirovic
Faculty of Electrical Engineering, University of Belgrade, Serbia
Natasa Cirovic

Editor information

Editors and Affiliations

Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute of Informatics and Automation of the Russian Academy of Sciences, 39, 14th line, 199178, St. Petersburg, Russia
Andrey Ronzhin
Institute of Applied and Mathematical Linguistics, Moscow State Linguistic University, 38, Ostozhenka, 119034, Moscow, Russia
Rodmonga Potapova
Faculty of Technical Sciences, University of Novi Sad, 6, Trg Dositeja Obradovića, 21000, Novi Sad, Serbia
Vlado Delic

About this paper

Cite this paper

Cirovic, Z., Cirovic, N. (2014). A Robust SVM/GMM Classifier for Speaker Verification. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_9

DOI: https://doi.org/10.1007/978-3-319-11581-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11580-1
Online ISBN: 978-3-319-11581-8
eBook Packages: Computer Science, Computer Science (R0)
