Reliability Estimation of the Speaker Verification Decisions Using Bayesian Networks to Combine Information from Multiple Speech Quality Measures

Villalba, Jesús; Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio

doi:10.1007/978-3-642-35292-8_1

Part of the book series: Communications in Computer and Information Science (CCIS, volume 328)

Abstract

In some situations, the quality of the signals involved in a speaker verification trial is not good enough to make a reliable decision. In this work, we use Bayesian networks to model the relations between the speaker verification score, a set of speech quality measures, and the trial reliability. We use this model to detect and discard unreliable trials. We present results on the NIST SRE2010 dataset artificially degraded with different types and levels of additive noise and reverberation. We show that a speaker verification system that is well calibrated for clean speech produces an unacceptable actual detection cost function (DCF) on the degraded dataset. We show how this method can be used to reduce the actual DCF to values lower than 1. We compare results using different quality measures and Bayesian network configurations.
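The idea of scoring trial reliability from quality measures and discarding unreliable trials can be illustrated with a minimal sketch. It is not one of the network configurations evaluated in the paper: it assumes the simplest possible Bayesian network, a binary reliability variable whose child nodes are conditionally independent Gaussian quality measures, and it leaves out the verification score that the paper's model also includes. The measure names (`snr_db`, `modulation_index`), all Gaussian parameters, the prior, and the 0.5 threshold are hypothetical values chosen only for illustration.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - 0.5 * ((x - mean) / std) ** 2


# Hypothetical prior and per-measure (mean, std) for each reliability state.
# In practice these would be estimated from labelled training trials.
PRIOR_RELIABLE = 0.5
PARAMS = {
    "snr_db": {"reliable": (25.0, 5.0), "unreliable": (5.0, 5.0)},
    "modulation_index": {"reliable": (0.8, 0.1), "unreliable": (0.5, 0.1)},
}


def posterior_reliable(measures, prior=PRIOR_RELIABLE):
    """P(reliable | quality measures) under the naive Bayes assumption."""
    log_rel = math.log(prior)
    log_unrel = math.log(1.0 - prior)
    for name, value in measures.items():
        log_rel += gaussian_logpdf(value, *PARAMS[name]["reliable"])
        log_unrel += gaussian_logpdf(value, *PARAMS[name]["unreliable"])
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_rel, log_unrel)
    p_rel = math.exp(log_rel - top)
    p_unrel = math.exp(log_unrel - top)
    return p_rel / (p_rel + p_unrel)


def keep_reliable(trials, threshold=0.5):
    """Discard trials whose reliability posterior falls below the threshold."""
    return [t for t in trials if posterior_reliable(t["measures"]) >= threshold]


trials = [
    {"id": "clean", "measures": {"snr_db": 28.0, "modulation_index": 0.82}},
    {"id": "noisy", "measures": {"snr_db": 3.0, "modulation_index": 0.45}},
]
kept = keep_reliable(trials)
print([t["id"] for t in kept])
```

With these made-up parameters only the clean trial survives the filter. Raising the threshold discards more trials, trading coverage for confidence in the decisions that remain.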

Author information

Authors and Affiliations

Communications Technology Group (GTC), Aragon Institute for Engineering Research (I3A), University of Zaragoza, Spain
Jesús Villalba, Eduardo Lleida, Alfonso Ortega & Antonio Miguel

Editor information

Editors and Affiliations

Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Doroteo Torre Toledano
Centro Politécnico Superior, Edificio Ada Byron, C/ María de Luna nº 1, 50018, Zaragoza, Spain
Alfonso Ortega Giménez
Universidade de Aveiro, Campus Universitário Aveiro, 3810-193, Aveiro, Portugal
António Teixeira
Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Joaquín González Rodríguez
E.T.S.I.Telecomunicacion, Universidad Politécnica de Madrid, Ciudad Universitaria s/n, 28040, Madrid, Spain
Luis Hernández Gómez & Rubén San Segundo Hernández
Escuela Politecnica Superior, Universidad Autonoma de Madrid, C/ Francisco, Tomas y Valiente 11, 28049, Madrid, Spain
Daniel Ramos Castro

Cite this paper

Villalba, J., Lleida, E., Ortega, A., Miguel, A. (2012). Reliability Estimation of the Speaker Verification Decisions Using Bayesian Networks to Combine Information from Multiple Speech Quality Measures. In: Torre Toledano, D., et al. Advances in Speech and Language Technologies for Iberian Languages. Communications in Computer and Information Science, vol 328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35292-8_1

DOI: https://doi.org/10.1007/978-3-642-35292-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35291-1
Online ISBN: 978-3-642-35292-8
eBook Packages: Computer Science, Computer Science (R0)
