Abstract
The annual NIST Speaker Recognition Evaluations (SREs) from 1996 to 2006 have been internationally recognized as the leading source of performance evaluation of research systems in the speaker classification field. We discuss how these evaluations have developed and been conducted and the performance measures used. We consider the key factors that have been studied for their effect on performance, including training and test durations, channel variability, and speaker variability. We examine the extent to which progress has been observed in state-of-the-art performance. We also consider how the technology has changed over the past decade, other evaluations that have been conducted or planned, and where the field may be headed in the future.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Martin, A.F. (2007). Evaluations of Automatic Speaker Classification Systems. In: Müller, C. (ed.) Speaker Classification I. Lecture Notes in Computer Science, vol 4343. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74200-5_18
DOI: https://doi.org/10.1007/978-3-540-74200-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74186-2
Online ISBN: 978-3-540-74200-5
eBook Packages: Computer Science, Computer Science (R0)