Abstract
Speaker modeling with sparse training data is an active branch of robust speaker recognition research. This paper presents a novel modeling approach, the Multi-EigenSpace modeling technique based on Regression Class (RC-MES), which integrates the common eigenspace technique with the regression class (RC) idea of Maximum Likelihood Linear Regression (MLLR). RC-MES not only overcomes the prior-knowledge limitation of Gaussian Mixture Models (GMM) but also remedies a shortcoming of the common eigenspace, which confuses speaker differences with phoneme differences. Eigenvoice analysis within an RC can provide better discrimination between speakers. Experimental results on speaker identification with 75 male speakers show that, when enrolment data is sparse, RC-MES provides a significant improvement over GMM, and that RC-MES requires fewer eigenvoices than the common eigenspace.
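The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of building one eigenspace per regression class, not the authors' implementation. It assumes that each training speaker is already represented, for every regression class, by a supervector of stacked Gaussian means belonging to that class, and it uses plain PCA (via SVD) to obtain the eigenvoices. The function names, the class labels, and the projection step are all hypothetical; eigenvoice systems usually estimate the weights by maximum likelihood from enrolment data, which is replaced here by a simple orthogonal projection.

```python
import numpy as np

def class_eigenvoices(supervectors, n_eigen):
    """PCA on the speaker supervectors of ONE regression class (assumed setup).

    supervectors: array of shape (n_speakers, dim), one row per training speaker.
    Returns (mean, basis), where basis has shape (n_eigen, dim).
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Rows of vt are orthonormal principal directions, used here as eigenvoices.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_eigen]

def multi_eigenspace(supervectors_by_class, n_eigen):
    """Build a separate eigenspace for every regression class."""
    return {rc: class_eigenvoices(sv, n_eigen)
            for rc, sv in supervectors_by_class.items()}

def project(supervector, mean, basis):
    """Coordinates of a speaker in one class eigenspace.

    Simplification: orthogonal projection instead of a maximum-likelihood
    weight estimate from enrolment data.
    """
    return basis @ (supervector - mean)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in data: 20 speakers, two hypothetical regression classes.
    data = {
        "class_a": rng.normal(size=(20, 12)),
        "class_b": rng.normal(size=(20, 8)),
    }
    spaces = multi_eigenspace(data, n_eigen=3)
    for rc, (mean, basis) in spaces.items():
        weights = project(data[rc][0], mean, basis)
        print(rc, basis.shape, weights.shape)
```

Keeping a separate basis per class means that variation between phoneme-like groups of Gaussians is not mixed into the directions that describe variation between speakers, which is the motivation stated in the abstract.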
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fu, Z., Zhao, R. (2004). Speaker Modeling Technique Based on Regression Class for Speaker Identification with Sparse Training. In: Li, S.Z., Lai, J., Tan, T., Feng, G., Wang, Y. (eds) Advances in Biometric Person Authentication. SINOBIOMETRICS 2004. Lecture Notes in Computer Science, vol 3338. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30548-4_70
DOI: https://doi.org/10.1007/978-3-540-30548-4_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24029-7
Online ISBN: 978-3-540-30548-4
eBook Packages: Computer Science, Computer Science (R0)