Selective Fusion for Speaker Verification in Surveillance

  • Conference paper
Intelligence and Security Informatics (ISI 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3495)


Abstract

This paper presents an improved speaker verification technique that is especially suited to surveillance scenarios. The main idea is a meta-learning scheme that improves the fusion of low- and high-level speech information. Whereas some existing systems fuse the outputs of several classifiers, the proposed method uses a selective fusion scheme that takes into account the transmission channel, speaking style, and speaker stress, as estimated from the test utterance. Moreover, we show that simultaneously employing multi-resolution versions of regular classifiers boosts fusion performance. The proposed selective fusion method, aided by multi-resolution classifiers, decreases the error rate by 30% compared with ordinary fusion.
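The selective fusion idea can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, the classifier set, or how conditions are estimated, so everything below is an assumption made for illustration: a linear weighted sum stands in for the fusion rule, a lookup table keyed by (channel, speaking style, stress) stands in for the meta-learned selection, and all names, scores, and weights are hypothetical. It is not the authors' implementation.

```python
from typing import Dict, Tuple

# Hypothetical condition descriptor: (channel, speaking style, stress level),
# assumed to have been estimated from the test utterance by some other component.
Condition = Tuple[str, str, str]
Weights = Dict[str, float]


def fuse(scores: Dict[str, float], weights: Weights) -> float:
    """Linear fusion (an assumption): weighted sum of classifier scores."""
    return sum(weights[name] * score for name, score in scores.items())


def selective_fuse(
    scores: Dict[str, float],
    condition: Condition,
    weights_by_condition: Dict[Condition, Weights],
    default_weights: Weights,
) -> float:
    """Pick the fusion weights that match the estimated test condition.

    Falls back to a single condition-independent weight set, which plays
    the role of "ordinary fusion" in this sketch.
    """
    weights = weights_by_condition.get(condition, default_weights)
    return fuse(scores, weights)


# Illustrative values only; none of these numbers come from the paper.
scores = {"acoustic_gmm": 2.0, "idiolect_ngram": 4.0}
default_weights = {"acoustic_gmm": 0.5, "idiolect_ngram": 0.5}
weights_by_condition = {
    ("cellular", "conversational", "stressed"): {
        "acoustic_gmm": 0.25,
        "idiolect_ngram": 0.75,
    },
}

matched = selective_fuse(
    scores, ("cellular", "conversational", "stressed"),
    weights_by_condition, default_weights,
)
fallback = selective_fuse(
    scores, ("landline", "read", "neutral"),
    weights_by_condition, default_weights,
)
print(matched, fallback)  # 3.5 3.0
```

In this sketch the condition-specific weights shift trust toward the high-level classifier when the channel is assumed to degrade acoustic evidence; in a real system the per-condition fusers would be trained on development data rather than written by hand.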




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Solewicz, Y.A., Koppel, M. (2005). Selective Fusion for Speaker Verification in Surveillance. In: Kantor, P., et al. Intelligence and Security Informatics. ISI 2005. Lecture Notes in Computer Science, vol 3495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11427995_22

  • DOI: https://doi.org/10.1007/11427995_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25999-2

  • Online ISBN: 978-3-540-32063-0

  • eBook Packages: Computer Science, Computer Science (R0)
