
Robust Principal Component Analysis Based Speaker Verification Under Additive Noise Conditions

  • Conference paper
Pattern Recognition (CCPR 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 663)


Abstract

Previous research shows that approaches based on the total variability space (TVS) followed by Gaussian probabilistic linear discriminant analysis (GPLDA) deal effectively with convolutional noise (such as channel noise) and can also bring some gains in accuracy under additive noise. However, they run into difficulty when the noise types encountered in the real world are unseen and non-stationary. To address this issue, we introduce robust principal component analysis (RPCA) into the TVS-modeled speaker verification system, called RPCA-TVS, which regards the noise spectrum as the low-rank component and the speech spectrum as the sparse component in the short-time Fourier transform (STFT) domain. The highlight of this paper is the improved robustness of speaker verification under additive noise, especially in non-stationary and unseen noise conditions. To evaluate performance, we designed and generated an additive-noise corpus based on the TIMIT and NUST603-2014 databases, using the NaFT tools with 12 types of noise samples derived from NOISEX-92 and FREESOUND. Experimental results demonstrate that the proposed RPCA-TVS achieves better performance than the competing methods at various signal-to-noise ratio (SNR) levels. In particular, RPCA-TVS reduces the equal error rate (EER) by 5.12% on average compared with the multi-condition system under additive noise conditions at SNR = 8 dB.

This work is supported by the National Science Foundation of China (Grant no. 61473154).
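For intuition about the decomposition the abstract describes, the sketch below (not the authors' code; the parameter values, stopping rule, and toy spectrogram are illustrative assumptions) recovers a low-rank matrix L and a sparse matrix S from an observed matrix M = L + S using principal component pursuit solved with an inexact augmented Lagrange multiplier loop, the standard formulation of RPCA due to Candès et al.

```python
# Minimal RPCA sketch (assumption: principal component pursuit via inexact ALM).
# In the paper's setting, M would be an STFT magnitude spectrogram: the low-rank
# part L models the additive noise floor, the sparse part S models the speech.
import numpy as np

def shrink(X, tau):
    """Soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(shrink(s, tau)) @ Vt

def rpca(M, lam=None, tol=1e-7, max_iter=500):
    """Decompose M ~ L + S with L low-rank and S sparse (inexact ALM)."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))  # usual default
    norm_M = np.linalg.norm(M, 'fro')
    mu = m * n / (4.0 * np.abs(M).sum())                         # heuristic step size
    Y = M / max(np.linalg.norm(M, 2), np.abs(M).max() / lam)     # dual variable init
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)   # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)          # sparse update
        residual = M - L - S
        Y = Y + mu * residual                         # dual ascent step
        if np.linalg.norm(residual, 'fro') / norm_M < tol:
            break
    return L, S

# Toy usage: a synthetic "spectrogram" = rank-1 noise floor + sparse speech bins.
rng = np.random.default_rng(0)
noise = rng.random((257, 1)) @ rng.random((1, 200))
speech = (rng.random((257, 200)) < 0.05) * rng.random((257, 200)) * 5.0
L_hat, S_hat = rpca(noise + speech)
print("rank(L) ~", np.linalg.matrix_rank(L_hat, tol=1e-3),
      "| nonzeros in S:", int((np.abs(S_hat) > 1e-3).sum()))
```

In the paper's pipeline, S would serve as the enhanced speech spectrum fed to the TVS/i-vector front end; the sketch is only meant to illustrate the low-rank-noise / sparse-speech assumption, not the authors' exact configuration.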


Notes

  1. Available from http://www.freesound.com.


Author information


Corresponding author

Correspondence to Minghe Wang.


Copyright information

© 2016 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, M., Zhang, E., Tang, Z. (2016). Robust Principal Component Analysis Based Speaker Verification Under Additive Noise Conditions. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_49


  • DOI: https://doi.org/10.1007/978-981-10-3005-5_49


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3004-8

  • Online ISBN: 978-981-10-3005-5

  • eBook Packages: Computer Science, Computer Science (R0)
