Combining cohort and UBM models in open set speaker detection

Brew, Anthony; Cunningham, Pádraig

doi:10.1007/s11042-009-0381-x

Published: 14 October 2009

Volume 48, pages 141–159, (2010)

Multimedia Tools and Applications

Abstract

In speaker detection it is important to build an alternative model against which to compare scores from the ‘target’ speaker model. There are two common strategies for building it: construct a single global model by sampling from a pool of training data, the Universal Background Model (UBM), or build a cohort of models from selected individuals in the training data for the target speaker. The main contribution of this paper is to show that these approaches can be unified by using a Support Vector Machine (SVM) to learn a decision rule in the score space made up of the output scores of the client, cohort and UBM models.
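The idea of a decision rule in score space can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores, and the choice of a linear SVM trained by subgradient descent on the hinge loss are all assumptions made for illustration (the abstract does not state the kernel or training procedure). The sketch shows that UBM normalisation and cohort normalisation can each be written as a fixed linear rule over the vector of client, cohort and UBM scores, whereas an SVM learns the weights from labelled trials.

```python
import random

def score_vector(client, cohort, ubm):
    """Stack the model scores (e.g. log-likelihoods) into one score-space point."""
    return [client, *cohort, ubm]

def linear_rule(weights, bias, x):
    """Signed decision value w.x + b; positive means 'accept as target'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def ubm_rule_weights(n_cohort):
    """Fixed weights equivalent to UBM normalisation: client - ubm."""
    return [1.0] + [0.0] * n_cohort + [-1.0]

def cohort_rule_weights(n_cohort):
    """Fixed weights equivalent to cohort normalisation: client - mean(cohort)."""
    return [1.0] + [-1.0 / n_cohort] * n_cohort + [0.0]

def train_linear_svm(xs, ys, lam=0.01, epochs=200, lr=0.05, seed=0):
    """Illustrative primal linear SVM: stochastic subgradient descent on the
    L2-regularised hinge loss. Labels ys must be +1 (target) or -1 (impostor)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(xs[0]), 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = ys[i] * linear_rule(w, b, xs[i])
            # Regularisation step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge-loss subgradient step for a margin violation.
                w = [wj + lr * ys[i] * xj for wj, xj in zip(w, xs[i])]
                b += lr * ys[i]
    return w, b

# One trial with made-up scores: client model, two cohort models, UBM.
x = score_vector(client=-40.0, cohort=[-45.0, -47.0], ubm=-44.0)
print(linear_rule(ubm_rule_weights(2), 0.0, x))     # client - ubm = 4.0
print(linear_rule(cohort_rule_weights(2), 0.0, x))  # client - mean(cohort) = 6.0

# Toy, hypothetical training trials (+1 target, -1 impostor).
train_x = [
    score_vector(-40.0, [-45.0, -47.0], -44.0),
    score_vector(-41.0, [-46.0, -44.5], -43.0),
    score_vector(-48.0, [-44.0, -45.0], -43.5),
    score_vector(-47.0, [-43.0, -46.0], -44.0),
]
train_y = [1, 1, -1, -1]
w, b = train_linear_svm(train_x, train_y)
print(linear_rule(w, b, x) > 0)  # learned rule's accept/reject decision for x
```

Under these assumptions the two classical normalisations are special cases of the same linear form, which is why a learned rule over the full score vector can subsume both. In practice the scores would usually be scaled before training, and a kernel SVM from an established library would replace the hand-written trainer.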

Notes

Linguistic Data Consortium, http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC95S22

Author information

Authors and Affiliations

Machine Learning Group, School of Computer Science and Informatics, University College Dublin, Dublin, Ireland
Anthony Brew & Pádraig Cunningham

Corresponding author

Correspondence to Anthony Brew.

About this article

Cite this article

Brew, A., Cunningham, P. Combining cohort and UBM models in open set speaker detection. Multimed Tools Appl 48, 141–159 (2010). https://doi.org/10.1007/s11042-009-0381-x

Published: 14 October 2009
Issue Date: May 2010
DOI: https://doi.org/10.1007/s11042-009-0381-x
