Bayes-Optimal Estimation of GMM Parameters for Speaker Recognition

Garcia, Guillermo; Jung, Sung-Kyo; Eriksson, Thomas

doi:10.1007/978-3-540-74122-0_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4441)

Abstract

In text-independent speaker recognition, Gaussian mixture models (GMMs) are widely used as statistical models of the speakers. It is commonly assumed that the Expectation-Maximization (EM) algorithm can estimate the optimal model parameters, such as the weight, mean and variance of each Gaussian component, for each speaker. This is not entirely true, because of practical limitations such as the limited size of the training database and uncertainties in the model parameters. As is well known in the literature, limited database size is one of the largest challenges in speaker recognition research. In this paper, we investigate methods to overcome the database and parameter uncertainty problems. By reformulating the GMM estimation problem in a Bayes-optimal way (as opposed to the maximum-likelihood (ML) criterion underlying the EM algorithm), we are able to adjust the GMM parameters to better cope with limited database size and other parameter uncertainties. Experimental results show the effectiveness of the proposed approach.
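The contrast between ML-optimal and Bayes-optimal estimation can be illustrated with a minimal sketch for a single Gaussian mean. This is a textbook conjugate-prior example, not the method of the chapter: the function names, the Gaussian prior and the assumption of a known observation variance are all illustrative choices made here.

```python
from statistics import fmean


def ml_mean(samples):
    """Maximum-likelihood estimate of a Gaussian mean: the sample average."""
    return fmean(samples)


def bayes_mean(samples, prior_mean, prior_var, noise_var):
    """Posterior mean of a Gaussian mean under a Gaussian prior.

    Assumption (illustrative only): the observation variance `noise_var`
    is known, so the Gaussian prior on the mean is conjugate.
    """
    n = len(samples)
    # Precisions (inverse variances) of the prior and of the data add up.
    posterior_precision = 1.0 / prior_var + n / noise_var
    weighted_sum = prior_mean / prior_var + sum(samples) / noise_var
    return weighted_sum / posterior_precision


# With only two training samples, the Bayes estimate is pulled toward the prior.
few = [2.0, 3.0]
print(ml_mean(few))
print(bayes_mean(few, prior_mean=0.0, prior_var=1.0, noise_var=1.0))

# With many samples, the two estimates nearly coincide.
many = [2.5] * 200
print(bayes_mean(many, prior_mean=0.0, prior_var=1.0, noise_var=1.0))
```

With little data the prior dominates and the estimate is hedged against parameter uncertainty; as the training set grows, the Bayes estimate approaches the ML estimate.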

Author information

Authors and Affiliations

Communication System Group, Department of Signals and Systems, Chalmers University of Technology, 412 96 Göteborg, Sweden
Guillermo Garcia, Sung-Kyo Jung & Thomas Eriksson

Editor information

Christian Müller

About this chapter

Cite this chapter

Garcia, G., Jung, SK., Eriksson, T. (2007). Bayes-Optimal Estimation of GMM Parameters for Speaker Recognition. In: Müller, C. (eds) Speaker Classification II. Lecture Notes in Computer Science, vol 4441. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74122-0_13

DOI: https://doi.org/10.1007/978-3-540-74122-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74121-3
Online ISBN: 978-3-540-74122-0
eBook Packages: Computer Science, Computer Science (R0)