A Robust SVM/GMM Classifier for Speaker Verification

  • Conference paper
Speech and Computer (SPECOM 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8773)

Abstract

One of the basic problems in speaker verification applications is the presence of environmental noise. State-of-the-art speaker verification models based on Support Vector Machines (SVM) are significantly vulnerable to high noise levels. This paper presents an SVM/GMM classifier for text-independent speaker verification that shows additional robustness. Two techniques for training the GMM models are applied, and they give different results depending on the level of environmental noise. The recognition phase was tested with Serbian speakers at different signal-to-noise ratios (SNR).
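Testing at different SNRs usually means mixing clean speech with noise scaled to a chosen power ratio. The abstract does not say which noise types or SNR values the authors used, so the following is only a minimal sketch of the standard SNR definition, not the paper's procedure. The function names `mix_at_snr` and `measured_snr_db`, the synthetic tone standing in for speech, and the SNR values 20, 10 and 0 dB are all assumptions made for illustration.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the target SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("signals must have non-zero power")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR in dB, treating (noisy - speech) as the noise component."""
    p_speech = sum(s * s for s in speech)
    p_noise = sum((y - s) ** 2 for s, y in zip(speech, noisy))
    return 10 * math.log10(p_speech / p_noise)


rng = random.Random(0)
# Stand-in signals (hypothetical): a 200 Hz tone at 8 kHz and white Gaussian noise.
speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

for snr_db in (20, 10, 0):
    noisy = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(measured_snr_db(speech, noisy), 2))
```

In a real evaluation the tone would be replaced by recorded utterances and the white noise by recorded environmental noise; the scaling step stays the same.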

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Cirovic, Z., Cirovic, N. (2014). A Robust SVM/GMM Classifier for Speaker Verification. In: Ronzhin, A., Potapova, R., Delic, V. (eds) Speech and Computer. SPECOM 2014. Lecture Notes in Computer Science, vol 8773. Springer, Cham. https://doi.org/10.1007/978-3-319-11581-8_9

  • DOI: https://doi.org/10.1007/978-3-319-11581-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11580-1

  • Online ISBN: 978-3-319-11581-8

  • eBook Packages: Computer Science, Computer Science (R0)
