Evaluations of Automatic Speaker Classification Systems

Martin, Alvin F.

doi:10.1007/978-3-540-74200-5_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4343)

Abstract

The annual NIST Speaker Recognition Evaluations (SREs) from 1996 to 2006 have been internationally recognized as the leading source of performance evaluation for research systems in the speaker classification field. We discuss how these evaluations have developed, how they have been conducted, and which performance measures they use. We consider the key factors that have been studied for their effect on performance, including training and test durations, channel variability, and speaker variability. We examine the extent to which progress has been observed in state-of-the-art performance. We also consider how the technology has changed over the past decade, other evaluations that have been conducted or planned, and where the field may be headed in the future.

Author information

Authors and Affiliations

National Institute of Standards and Technology, 100 Bureau Drive Stop 8940, Gaithersburg, MD 20899-8940
Alvin F. Martin

Editor information

Christian Müller

Cite this chapter

Martin, A.F. (2007). Evaluations of Automatic Speaker Classification Systems. In: Müller, C. (ed.) Speaker Classification I. Lecture Notes in Computer Science, vol 4343. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74200-5_18

DOI: https://doi.org/10.1007/978-3-540-74200-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74186-2
Online ISBN: 978-3-540-74200-5
eBook Packages: Computer Science, Computer Science (R0)