Abstract
It is well known that variability in speech signal quality affects the performance of speaker recognition systems. A difference in speech quality between enrollment and test utterances shifts the scores and degrades performance, so score calibration is required for speaker recognition to remain effective in these circumstances. The speech signal parameters that have a strong impact on speaker recognition performance are total speech duration, signal-to-noise ratio, and reverberation time; their variability leads to score shifts and unreliable accept/reject decisions. In this paper we investigate the effect of speech duration variability on calibration when the enrollment and test utterances originate from the same channel. An effective method of score stabilization is also presented.
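The score shift described above can be illustrated with a minimal sketch. It is not the stabilization method of this paper, which the abstract does not specify. It assumes that target and non-target scores are Gaussian with a shared variance, in which case the log-likelihood ratio is an affine function of the raw score. The two duration conditions, their means, and all function names are hypothetical and chosen only for illustration.

```python
import random
import statistics


def fit_gaussian_calibration(target_scores, nontarget_scores):
    """Return (a, b) so that llr = a * score + b.

    Assumes both classes are Gaussian with a shared (pooled) variance.
    """
    mu_t = statistics.fmean(target_scores)
    mu_n = statistics.fmean(nontarget_scores)
    n_t, n_n = len(target_scores), len(nontarget_scores)
    # Pooled variance of the two classes around their own means.
    var = (
        (n_t - 1) * statistics.variance(target_scores)
        + (n_n - 1) * statistics.variance(nontarget_scores)
    ) / (n_t + n_n - 2)
    a = (mu_t - mu_n) / var
    b = -(mu_t ** 2 - mu_n ** 2) / (2.0 * var)
    return a, b


def error_rate(target_values, nontarget_values, threshold=0.0):
    """Average of miss rate and false-alarm rate at a fixed threshold."""
    miss = sum(x < threshold for x in target_values) / len(target_values)
    fa = sum(x >= threshold for x in nontarget_values) / len(nontarget_values)
    return 0.5 * (miss + fa)


def simulate(mu_t, mu_n, n, rng):
    """Draw synthetic target and non-target scores with unit variance."""
    targets = [rng.gauss(mu_t, 1.0) for _ in range(n)]
    nontargets = [rng.gauss(mu_n, 1.0) for _ in range(n)]
    return targets, nontargets


rng = random.Random(0)
# Hypothetical conditions: short utterances shift and compress the scores.
conditions = {"long": (4.0, -4.0), "short": (1.0, -3.0)}
results = {}
for name, (mu_t, mu_n) in conditions.items():
    dev_t, dev_n = simulate(mu_t, mu_n, 2000, rng)    # calibration trials
    eval_t, eval_n = simulate(mu_t, mu_n, 2000, rng)  # held-out trials
    a, b = fit_gaussian_calibration(dev_t, dev_n)
    raw = error_rate(eval_t, eval_n)  # fixed threshold 0 on raw scores
    cal = error_rate([a * s + b for s in eval_t],
                     [a * s + b for s in eval_n])  # threshold 0 on LLRs
    results[name] = {"a": a, "b": b, "raw_error": raw, "calibrated_error": cal}
    print(f"{name}: a={a:.2f} b={b:.2f} raw={raw:.3f} calibrated={cal:.3f}")
```

In this toy setting a single threshold on raw scores suits the long condition but not the short one, whereas fitting the affine map per condition lets the same threshold on the calibrated scores be used for both.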
Acknowledgments
This work was partially financially supported by the Government of the Russian Federation, Grant 074-U01.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Shulipa, A., Novoselov, S., Matveev, Y. (2016). Scores Calibration in Speaker Recognition Systems. In: Ronzhin, A., Potapova, R., Németh, G. (eds) Speech and Computer. SPECOM 2016. Lecture Notes in Computer Science, vol 9811. Springer, Cham. https://doi.org/10.1007/978-3-319-43958-7_72
DOI: https://doi.org/10.1007/978-3-319-43958-7_72
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43957-0
Online ISBN: 978-3-319-43958-7
eBook Packages: Computer Science, Computer Science (R0)