Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space

  • Conference paper
Statistical Language and Speech Processing (SLSP 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8791)

Abstract

In the last few years, the use of i-vectors along with a generative back-end has become the new standard in speaker recognition. An i-vector is a compact representation of a speaker utterance extracted from a low-dimensional total variability subspace. Although current speaker recognition systems achieve very good results in clean training and test conditions, their performance degrades considerably in noisy environments. Compensating for the effect of noise is currently a research subject of major importance. As far as we know, there has been no serious attempt to treat the noise problem directly in the i-vector space without relying on data distributions computed in a prior domain. This paper proposes a full-covariance Gaussian model of the clean i-vector and noise distributions in the i-vector space, then introduces a technique that uses a MAP approach to estimate a clean i-vector given its noisy version and the noise density function. Based on NIST data, we show that it is possible to improve the baseline system performance by up to 60 %. A noise adding tool is used to simulate a real-world noisy environment at different signal-to-noise ratio levels.
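
The abstract does not give the estimator itself, so the following is only a minimal sketch of what a MAP estimate looks like under the assumptions it names: an additive model y = x + n in the i-vector space, with full-covariance Gaussians for the clean i-vector x and the noise n. Under those assumptions the posterior of x given y is Gaussian, so its mode has a standard closed form. The function name `map_clean_ivector`, the parameter names, and the toy dimensions are illustrative choices and are not taken from the paper, which may differ in how the distributions are estimated and applied.

```python
import numpy as np


def map_clean_ivector(y, mu_x, cov_x, mu_n, cov_n):
    """MAP estimate of a clean vector x from y = x + n.

    Assumes (hypothetically, for illustration) x ~ N(mu_x, cov_x) and
    n ~ N(mu_n, cov_n), independent, both with full covariance matrices.
    For Gaussians the posterior mode equals the posterior mean:
        x_hat = mu_x + cov_x (cov_x + cov_n)^{-1} (y - mu_x - mu_n)
    """
    y, mu_x, mu_n = (np.asarray(v, dtype=float) for v in (y, mu_x, mu_n))
    residual = y - mu_x - mu_n
    # Solve a linear system instead of forming an explicit inverse.
    return mu_x + cov_x @ np.linalg.solve(cov_x + cov_n, residual)


# Toy example with synthetic data (not NIST data, not real i-vectors).
rng = np.random.default_rng(0)
dim = 4
a = rng.standard_normal((dim, dim))
b = rng.standard_normal((dim, dim))
cov_x = a @ a.T + dim * np.eye(dim)  # symmetric positive definite
cov_n = b @ b.T + dim * np.eye(dim)
mu_x = rng.standard_normal(dim)
mu_n = rng.standard_normal(dim)

x_true = rng.multivariate_normal(mu_x, cov_x)
y = x_true + rng.multivariate_normal(mu_n, cov_n)
x_hat = map_clean_ivector(y, mu_x, cov_x, mu_n, cov_n)
```

In this sketch the estimate moves toward `y - mu_n` as the noise covariance shrinks and toward the clean prior mean `mu_x` as it grows, which is the behavior one would expect from a MAP estimator that balances the noisy observation against the clean i-vector prior.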

Author information

Corresponding author

Correspondence to Waad Ben Kheder.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ben Kheder, W., Matrouf, D., Bousquet, PM., Bonastre, JF., Ajili, M. (2014). Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space. In: Besacier, L., Dediu, AH., Martín-Vide, C. (eds) Statistical Language and Speech Processing. SLSP 2014. Lecture Notes in Computer Science, vol 8791. Springer, Cham. https://doi.org/10.1007/978-3-319-11397-5_7

  • DOI: https://doi.org/10.1007/978-3-319-11397-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11396-8

  • Online ISBN: 978-3-319-11397-5

  • eBook Packages: Computer Science, Computer Science (R0)
