Abstract
In conventional Gaussian Mixture Model – Universal Background Model (GMM-UBM) text-independent speaker verification, the discriminability between speaker models and the universal background model (UBM) is crucial to the system’s performance. In this paper, we present a method based on heteroscedastic linear discriminant analysis (HLDA) that enhances this discriminability. The technique aims to separate the individual Gaussian distributions in the feature space, so that the overlap between them is reduced after the discriminative transformation. As a result, some Gaussian components of a target speaker model can be adapted more fully during maximum a posteriori (MAP) adaptation, and these components gain more discriminative capability over the UBM. Results on the NIST 2004 Speaker Recognition data show that the method provides significant performance improvements over the baseline system.
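To make the notion of "sufficient" adaptation concrete, the following is a minimal sketch of the relevance-MAP mean update commonly used in GMM-UBM systems. It is an illustration under stated assumptions, not the configuration used in this paper: the function name map_adapt_means, the diagonal covariances, and the relevance factor of 16 are all hypothetical choices, and the HLDA transformation itself is not shown. Each component moves toward the enrollment data in proportion to its soft count, so a component that receives little posterior mass stays close to the UBM.

```python
import math


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of the means of a diagonal-covariance GMM.

    frames:    list of feature vectors (each a list of D floats)
    weights:   list of M mixture weights
    means:     list of M mean vectors (UBM means)
    variances: list of M diagonal variance vectors
    relevance: assumed relevance factor (hypothetical default)
    Returns (adapted_means, soft_counts).
    """
    num_comp = len(weights)
    dim = len(means[0])
    counts = [0.0] * num_comp
    sums = [[0.0] * dim for _ in range(num_comp)]

    for x in frames:
        # Log of weight times Gaussian density, per component.
        log_p = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xd, md, vd in zip(x, mu, var):
                ll -= 0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
            log_p.append(ll)
        # Normalise to posteriors, subtracting the max for numerical stability.
        top = max(log_p)
        p = [math.exp(v - top) for v in log_p]
        z = sum(p)
        for i in range(num_comp):
            gamma = p[i] / z
            counts[i] += gamma
            for d in range(dim):
                sums[i][d] += gamma * x[d]

    # With alpha_i = n_i / (n_i + r), the update
    #   alpha_i * E_i[x] + (1 - alpha_i) * mu_i
    # simplifies to (sum_i + r * mu_i) / (n_i + r), which is safe when n_i = 0.
    adapted = [
        [(sums[i][d] + relevance * means[i][d]) / (counts[i] + relevance)
         for d in range(dim)]
        for i in range(num_comp)
    ]
    return adapted, counts


# Toy example: a two-component 1-D UBM and enrollment data near one component.
ubm_weights = [0.5, 0.5]
ubm_means = [[-2.0], [2.0]]
ubm_vars = [[1.0], [1.0]]
enrol = [[3.0]] * 100
adapted_means, soft_counts = map_adapt_means(enrol, ubm_weights, ubm_means, ubm_vars)
```

In this toy case the second component collects almost all of the posterior mass and moves most of the way toward the data, while the first component barely moves. When two components overlap heavily, the posterior mass is split between them and each adapts less, which is one way to read the motivation for reducing the overlap before adaptation.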
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, H., Dong, Y., Zhao, X., Zhao, J., Wang, H. (2006). Discriminative Transformation for Sufficient Adaptation in Text-Independent Speaker Verification. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_58
DOI: https://doi.org/10.1007/11939993_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science (R0)