Discriminative Transformation for Sufficient Adaptation in Text-Independent Speaker Verification

Yang, Hao; Dong, Yuan; Zhao, Xianyu; Zhao, Jian; Wang, Haila

doi:10.1007/11939993_58

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4274)

Included in the following conference series:

International Symposium on Chinese Spoken Language Processing

Abstract

In conventional Gaussian Mixture Model – Universal Background Model (GMM-UBM) text-independent speaker verification, the discriminability between the speaker models and the universal background model (UBM) is crucial to system performance. In this paper, we present a method based on heteroscedastic linear discriminant analysis (HLDA) that enhances this discriminability. The technique aims to separate the individual Gaussian distributions in the feature space, so that after the discriminative transformation the overlap between Gaussian distributions is reduced. As a result, some Gaussian components of a target speaker model can be adapted more sufficiently during maximum a posteriori (MAP) adaptation, and these components discriminate better against the UBM. Results on the NIST 2004 Speaker Recognition data corpora show that the method provides significant performance improvements over the baseline system.
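
The link between overlap and adaptation can be illustrated with a toy sketch of mean-only relevance-MAP adaptation, the standard GMM-UBM recipe. It is not the paper's system: the features are one-dimensional, the relevance factor of 16 is an assumed common default, the function names are hypothetical, and the reduced overlap is simulated by moving a UBM mean rather than by applying an HLDA transform. Each component k collects a soft count n_k from the enrollment frames and is adapted with coefficient alpha_k = n_k / (n_k + r), so a component that shares its frames with an overlapping neighbour receives a smaller alpha_k.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Mean-only relevance-MAP adaptation of a 1-D GMM (the UBM) to `frames`.

    Returns (adapted_means, alphas); alphas[k] is the data-dependent
    adaptation coefficient n_k / (n_k + relevance) of component k.
    """
    num_comp = len(weights)
    occupancy = [0.0] * num_comp    # soft counts n_k
    weighted_sum = [0.0] * num_comp  # sum over frames of posterior * x
    for x in frames:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # frame is numerically outside every component
        for k in range(num_comp):
            posterior = likes[k] / total
            occupancy[k] += posterior
            weighted_sum[k] += posterior * x

    adapted, alphas = [], []
    for k in range(num_comp):
        alpha = occupancy[k] / (occupancy[k] + relevance)
        # With no data assigned, fall back to the UBM mean (alpha is 0 anyway).
        data_mean = weighted_sum[k] / occupancy[k] if occupancy[k] > 0.0 else means[k]
        adapted.append(alpha * data_mean + (1.0 - alpha) * means[k])
        alphas.append(alpha)
    return adapted, alphas


# Hypothetical enrollment data: 40 frames close to component 0.
frames = [0.3] * 40
weights = [0.5, 0.5]
variances = [1.0, 1.0]

# Heavily overlapped UBM components: frames are shared between both.
means_overlap, alphas_overlap = map_adapt_means(
    frames, weights, [0.0, 1.0], variances)

# Well-separated components (stand-in for the effect of the transform):
# nearly all frames are assigned to component 0.
means_separated, alphas_separated = map_adapt_means(
    frames, weights, [0.0, 4.0], variances)

print("alpha_0 overlapped:", alphas_overlap[0])
print("alpha_0 separated: ", alphas_separated[0])
```

In this toy setting the separated configuration yields a larger adaptation coefficient for component 0, and its adapted mean moves closer to the enrollment data, which is the sense in which a component is adapted "more sufficiently".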

Author information

Authors and Affiliations

France Telecom Research & Development Center, Beijing, 100083
Yuan Dong, Xianyu Zhao & Haila Wang
Beijing University of Posts and Telecommunications, Beijing, 100876
Hao Yang, Yuan Dong & Jian Zhao

Editor information

Editors and Affiliations

Department of Computer Science, The University of Hong Kong, Hong Kong
Qiang Huo
Human Language Technology Department, Institute for Infocomm Research (I2R), 119613, Singapore
Bin Ma
School of Computer Engineering, Nanyang Technological University (NTU), 639798, Singapore
Eng-Siong Chng
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, 119613, Singapore
Haizhou Li

About this paper

Cite this paper

Yang, H., Dong, Y., Zhao, X., Zhao, J., Wang, H. (2006). Discriminative Transformation for Sufficient Adaptation in Text-Independent Speaker Verification. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_58

DOI: https://doi.org/10.1007/11939993_58
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)
