Abstract
Individual weighting of speaker models in VQ-based recognition has some advantages, but it means that scores from different models may not be directly comparable, which makes identification difficult. It is also problematic for verification, as decision thresholds cannot easily be set without first testing models with genuine and imposter utterances. We present a novel normalisation method for VQ speaker recognition which applies an offset to each model, based on the average score between it and the imposter models, to bring particularly high- or low-scoring models into line with the general score range. The offset may be calculated a priori, before running any actual tests. The method works for both text-dependent and text-independent tasks and improves both the identification and verification error rates.
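The abstract does not give the exact formula, so the following is only a minimal sketch of how a per-model offset of this kind might be computed, under stated assumptions. It assumes that a score is the average VQ distortion (lower is better), that the other speakers' codebooks stand in for imposter data, and that the offset is a model's average imposter score minus the mean over all models. The function names, the toy codebooks, and the choice of squared Euclidean distance are hypothetical and are not taken from the paper.

```python
from statistics import mean


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword (lower = better match)."""
    return mean([min(sq_dist(v, c) for c in codebook) for v in vectors])


def a_priori_offsets(codebooks):
    """Assumed offset per model: its average score against the other models'
    codewords (used here as imposter data), minus the mean over all models.
    Needs no test utterances, so it can be computed before any trials."""
    imposter_avg = {}
    for name, codebook in codebooks.items():
        imposter_avg[name] = mean(
            [vq_distortion(other, codebook)
             for other_name, other in codebooks.items() if other_name != name]
        )
    overall = mean(list(imposter_avg.values()))
    return {name: avg - overall for name, avg in imposter_avg.items()}


def normalised_scores(utterance, codebooks, offsets):
    """Raw distortion minus the model's offset, so scores share a common range."""
    return {name: vq_distortion(utterance, cb) - offsets[name]
            for name, cb in codebooks.items()}


# Toy two-dimensional codebooks (hypothetical data, for illustration only).
codebooks = {
    "alice": [(0.0, 0.0), (1.0, 0.0)],
    "bob": [(5.0, 5.0), (6.0, 5.0)],
    "carol": [(0.0, 4.0), (1.0, 5.0)],
}
offsets = a_priori_offsets(codebooks)
utterance = [(0.1, 0.1), (0.9, -0.1)]
scores = normalised_scores(utterance, codebooks, offsets)
identified = min(scores, key=scores.get)  # identification: lowest normalised score
```

Because the offsets in this sketch are centred on the mean over all models, they sum to zero. A model that scores unusually high against imposters is pulled down and one that scores unusually low is pulled up, which is one way to make a single verification threshold more plausible across models.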
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Finan, R.A., Sapeluk, A.T., Damper, R.I. (1997). VQ score normalisation for text-dependent and text-independent speaker recognition. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015998
DOI: https://doi.org/10.1007/BFb0015998
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive