Selective Fusion for Speaker Verification in Surveillance

Solewicz, Yosef A.; Koppel, Moshe

doi:10.1007/11427995_22

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3495)

Included in the following conference series:

International Conference on Intelligence and Security Informatics

Abstract

This paper presents an improved speaker verification technique that is especially well suited to surveillance scenarios. The main idea is a meta-learning scheme aimed at improving the fusion of low- and high-level speech information. While some existing systems fuse several classifier outputs, the proposed method uses a selective fusion scheme that takes into account the conveying channel, speaking style, and speaker stress, as estimated on the test utterance. Moreover, we show that simultaneously employing multi-resolution versions of regular classifiers boosts fusion performance. The proposed selective fusion method, aided by multi-resolution classifiers, decreases the error rate by 30% compared with ordinary fusion.
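
The selective fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the meta-learner, the classifiers, or how conditions are estimated, so the linear fuser, the condition labels, the weights, and the names `LinearFuser` and `selective_fuse` below are all hypothetical. The sketch only shows the general pattern of choosing a fusion rule according to the condition estimated on the test utterance, with ordinary (condition-independent) fusion as the fallback.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

# Hypothetical condition label: (channel, speaking style, stress level).
# In a real system these would be estimated from the test utterance.
Condition = Tuple[str, str, str]


@dataclass(frozen=True)
class LinearFuser:
    """Weighted sum of classifier scores plus a bias.

    A stand-in for whatever trained fusion model is actually used.
    """
    weights: Tuple[float, ...]
    bias: float = 0.0

    def fuse(self, scores: Sequence[float]) -> float:
        if len(scores) != len(self.weights):
            raise ValueError("expected one score per classifier")
        return sum(w * s for w, s in zip(self.weights, scores)) + self.bias


def selective_fuse(
    scores: Sequence[float],
    condition: Condition,
    fusers: Dict[Condition, LinearFuser],
    default: LinearFuser,
) -> float:
    """Fuse with the rule matching the estimated condition, else the default."""
    return fusers.get(condition, default).fuse(scores)


# Illustrative values only: scores from a low-level (acoustic) classifier
# and a high-level (e.g. idiolectal) classifier.
ordinary = LinearFuser(weights=(0.5, 0.5))
per_condition = {
    ("cellular", "conversational", "neutral"): LinearFuser(weights=(0.25, 0.75)),
}

scores = (1.0, 3.0)
known = selective_fuse(
    scores, ("cellular", "conversational", "neutral"), per_condition, ordinary
)
unseen = selective_fuse(
    scores, ("landline", "read", "stressed"), per_condition, ordinary
)
print(known, unseen)
```

With these made-up weights, the matched condition leans on the high-level score, while an unseen condition falls back to the ordinary equal-weight fusion.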

Author information

Authors and Affiliations

Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
Yosef A. Solewicz & Moshe Koppel
Division of Identification and Forensic Science, Israel National Police, Jerusalem, Israel
Yosef A. Solewicz

Editor information

Editors and Affiliations

Department of Library and Information Science, Rutgers University
Paul Kantor
School of Communication, Information and Library Studies, Rutgers University, 4 Huntington Street, 08901-1071, New Brunswick, NJ, USA
Gheorghe Muresan
Artificial Solutions, Altonaer Poststraße 13b, 22767, Hamburg, Germany
Fred Roberts
MIS Department, University of Arizona, 85721, Tucson, AZ, USA
Daniel D. Zeng
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Fei-Yue Wang
Department of Management Information Systems, Eller College of Management, The University of Arizona, 85721, AZ, USA
Hsinchun Chen
College of Computing, Georgia Tech Information Security Center, Georgia Institute of Technology, 801 Atlantic Drive, 30332-0280, Atlanta, GA, USA
Ralph C. Merkle

About this paper

Cite this paper

Solewicz, Y.A., Koppel, M. (2005). Selective Fusion for Speaker Verification in Surveillance. In: Kantor, P., et al. Intelligence and Security Informatics. ISI 2005. Lecture Notes in Computer Science, vol 3495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11427995_22

DOI: https://doi.org/10.1007/11427995_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25999-2
Online ISBN: 978-3-540-32063-0
eBook Packages: Computer Science, Computer Science (R0)
