Effect of the Front-End Processing on Speaker Verification Performance Using PCA and Scores Level Fusion

  • Conference paper
E-Business and Telecommunications (ICETE 2013)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 456)

Abstract

This paper evaluates the impact of low-level features on speaker verification performance, with an emphasis on the recently proposed MFCC variant based on asymmetric tapers (hereafter MFCC-asymmetric). These features are used either stand-alone or followed by PCA as a linear projection technique applied before the GMM-UBM back-end classifier, in clean and noisy environments. The performance of the MFCC-asymmetric features is compared with that of the standard Mel-Frequency Cepstral Coefficients (MFCC) extracted from the TIMIT corpus, under clean and noisy conditions. A score-level fusion framework based on simple linear methods (such as min, max, and sum) and trained methods such as SVM is proposed to improve performance and to mitigate noise degradation. The results obtained on the corrupted TIMIT database confirm the superiority of the fused system over each individual system in noisy environments, and show a drastic degradation in the performance of PCA-based systems in the presence of environmental noise.
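
The simple fusion rules named in the abstract (min, max, sum) can be illustrated with a minimal sketch. It assumes two verification systems that each output one score per trial and that the scores have already been normalized to a comparable range; the function name `fuse_scores`, the variable names, and the score values are hypothetical and are not taken from the paper, whose exact normalization and fusion setup is not given in the abstract.

```python
import numpy as np

def fuse_scores(scores, rule="sum"):
    """Fuse per-system scores trial by trial.

    scores: array-like of shape (n_systems, n_trials); assumed to be
            normalized beforehand so that systems are comparable.
    rule:   one of "min", "max", "sum" (the simple rules named in the abstract).
    """
    scores = np.asarray(scores, dtype=float)
    rules = {"min": np.min, "max": np.max, "sum": np.sum}
    if rule not in rules:
        raise ValueError(f"unknown fusion rule: {rule!r}")
    # Reduce across systems (axis 0), leaving one fused score per trial.
    return rules[rule](scores, axis=0)

# Hypothetical scores for three trials from two front-ends.
mfcc_scores = np.array([0.2, 0.9, 0.5])        # standard MFCC system
asym_scores = np.array([0.4, 0.7, 0.1])        # MFCC-asymmetric system
stacked = np.vstack([mfcc_scores, asym_scores])

fused_min = fuse_scores(stacked, "min")
fused_max = fuse_scores(stacked, "max")
fused_sum = fuse_scores(stacked, "sum")
```

A trained fusion such as the SVM mentioned in the abstract would instead learn a decision function over the stacked score vectors from labeled trials; that step is omitted here because the abstract gives no details of its configuration.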

Corresponding author

Correspondence to Nassim Asbai.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Asbai, N., Bengherabi, M., Harizi, F., Amrouche, A. (2014). Effect of the Front-End Processing on Speaker Verification Performance Using PCA and Scores Level Fusion. In: Obaidat, M., Filipe, J. (eds) E-Business and Telecommunications. ICETE 2013. Communications in Computer and Information Science, vol 456. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44788-8_21

  • DOI: https://doi.org/10.1007/978-3-662-44788-8_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44787-1

  • Online ISBN: 978-3-662-44788-8

  • eBook Packages: Computer Science, Computer Science (R0)
