
Robust Principal Component Analysis Based Speaker Verification Under Additive Noise Conditions

  • Conference paper
Pattern Recognition (CCPR 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 663)


Abstract

Previous research shows that approaches based on the total variability space (TVS) followed by Gaussian probabilistic linear discriminant analysis (GPLDA) deal effectively with convolutional noise (such as channel noise) and can also bring some gains in accuracy under additive noise. However, they run into difficulty when the noise types encountered in the real world are unseen and non-stationary. To address this issue, we introduce robust principal component analysis (RPCA) into the TVS-modeled speaker verification system, called RPCA-TVS, which regards the noise spectrum as the low-rank component and the speech spectrum as the sparse component in the short-time Fourier transform (STFT) domain. The highlight of this paper is the improved robustness of speaker verification under additive noise, especially in non-stationary and unseen noise conditions. To evaluate performance, we designed and generated an additive-noise corpus based on the TIMIT and NUST603-2014 databases, using the NaFT tools with 12 types of noise samples derived from NOISEX-92 and FREESOUND. Experimental results demonstrate that the proposed RPCA-TVS achieves better performance than the competing methods at various signal-to-noise ratio (SNR) levels. In particular, RPCA-TVS reduces the equal error rate (EER) by 5.12% on average compared with the multi-condition system under additive noise conditions at SNR = 8 dB.

This work is supported by the National Science Foundation of China (Grant no. 61473154).
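For intuition about the decomposition the abstract describes, the sketch below (not the authors' code; the parameter values, stopping rule, and toy spectrogram are illustrative assumptions) recovers a low-rank matrix L and a sparse matrix S from an observed matrix M = L + S using principal component pursuit solved with an inexact augmented Lagrange multiplier loop, the standard formulation of RPCA due to Candès et al.

```python
# Minimal RPCA sketch (assumption: principal component pursuit via inexact ALM).
# In the paper's setting, M would be an STFT magnitude spectrogram: the low-rank
# part L models the additive noise floor, the sparse part S models the speech.
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(M, lam=None, tol=1e-7, max_iter=500):
    """Decompose M ~ L + S with L low-rank and S sparse (inexact ALM)."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))  # usual default
    norm_M = np.linalg.norm(M, 'fro')
    mu = m * n / (4.0 * np.abs(M).sum())                         # heuristic step size
    Y = M / max(np.linalg.norm(M, 2), np.abs(M).max() / lam)     # dual variable init
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)   # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)          # sparse update
        residual = M - L - S
        Y = Y + mu * residual                         # dual ascent step
        if np.linalg.norm(residual, 'fro') / norm_M < tol:
            break
    return L, S

# Toy usage: a synthetic "spectrogram" = rank-1 noise floor + sparse speech bins.
rng = np.random.default_rng(0)
noise = rng.random((257, 1)) @ rng.random((1, 200))
speech = (rng.random((257, 200)) < 0.05) * rng.random((257, 200)) * 5.0
L_hat, S_hat = rpca(noise + speech)
print("rank(L) ~", np.linalg.matrix_rank(L_hat, tol=1e-3),
      "| nonzeros in S:", int((np.abs(S_hat) > 1e-3).sum()))
```

In the paper's pipeline, S would serve as the enhanced speech spectrum fed to the TVS/i-vector front end; the sketch is only meant to illustrate the low-rank-noise / sparse-speech assumption, not the authors' exact configuration.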


Notes

  1. Available from http://www.freesound.com.


Author information


Corresponding author

Correspondence to Minghe Wang.


Copyright information

© 2016 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, M., Zhang, E., Tang, Z. (2016). Robust Principal Component Analysis Based Speaker Verification Under Additive Noise Conditions. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_49


  • DOI: https://doi.org/10.1007/978-981-10-3005-5_49


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3004-8

  • Online ISBN: 978-981-10-3005-5

  • eBook Packages: Computer Science, Computer Science (R0)
