Abstract
Although numerous speech representations have been reported to be useful in speaker recognition, there is far less agreement on which representation perfectly captures speaker-specific information. In this paper, we characterize a speaker's identity through the simultaneous use of various speech representations of the speaker's voice. We present a parametric statistical model, the generalized Gaussian mixture model, and develop an EM algorithm for its parameter estimation. Our approach has been applied to speaker recognition, and comparative results on the KING corpus demonstrate its effectiveness.
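The abstract does not spell out the generalized model or its EM updates, so they are not reproduced here. As background only, the following is a minimal sketch of EM for a standard one-dimensional Gaussian mixture, the model family that the generalized Gaussian mixture model extends. The function names (`gaussian_pdf`, `em_gmm_1d`, `log_likelihood`), the variance floor, and the initialization scheme are illustrative assumptions, not the paper's method.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, n_iter=50, var_floor=1e-6):
    """Fit a standard 1-D Gaussian mixture by EM, starting from the given means.

    Illustrative sketch only: this is the textbook GMM, not the paper's
    generalized model. Returns (weights, means, variances).
    """
    k = len(means)
    n = len(data)
    means = list(means)
    weights = [1.0 / k] * k
    # Assumption: every component starts at the overall sample variance.
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, var_floor)] * k

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against underflow to zero
            resp.append([j / total for j in joint])

        # M-step: re-estimate weights, means and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)  # keep variances strictly positive

    return weights, means, variances


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the fitted mixture."""
    total = 0.0
    for x in data:
        p = sum(w * gaussian_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total
```

In a typical mixture-based speaker identification setup, one such model would be fitted per enrolled speaker on feature vectors rather than scalars, and a test utterance would be assigned to the speaker whose model yields the highest `log_likelihood`. How the paper combines several speech representations within a single model is not shown by this sketch.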
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, K. (2004). Speaker Modeling with Various Speech Representations. In: Zhang, D., Jain, A.K. (eds) Biometric Authentication. ICBA 2004. Lecture Notes in Computer Science, vol 3072. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25948-0_81
DOI: https://doi.org/10.1007/978-3-540-25948-0_81
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22146-3
Online ISBN: 978-3-540-25948-0
eBook Packages: Springer Book Archive