Abstract
Time session variability between the enrollment data and the data to be recognized degrades speaker recognition performance, which makes it one of the most important issues in speaker recognition technology. In this paper, we propose a speaker recognition method that is robust to time session variability. The proposed method estimates a time session variability subspace and then carries out speaker recognition in the orthogonal complement of that subspace. In addition, we incorporate linear discriminant analysis into the proposed method. To evaluate the proposed method, we conducted a speaker identification experiment. The experimental results show that the proposed method improves speaker identification performance over the baseline.
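The idea of recognizing speakers in the orthogonal complement of a session variability subspace can be illustrated with a minimal NumPy sketch. The abstract does not say how the subspace is estimated or how scores are computed, so everything here is an assumption made for illustration: the subspace is taken as the leading principal directions of within-speaker deviations across sessions, scoring is cosine similarity, and the function names (`estimate_session_subspace`, `project_to_complement`, `identify`) are hypothetical. The linear discriminant analysis step mentioned in the abstract is not included.

```python
import numpy as np


def estimate_session_subspace(vectors, speaker_ids, n_dims):
    """Return an orthonormal basis (dim x n_dims) of an assumed session subspace.

    Assumption: session variability is approximated by the principal
    directions of each vector's deviation from its own speaker's mean.
    """
    vectors = np.asarray(vectors, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    residuals = np.empty_like(vectors)
    for spk in np.unique(speaker_ids):
        mask = speaker_ids == spk
        # Remove the speaker mean so only within-speaker variation remains.
        residuals[mask] = vectors[mask] - vectors[mask].mean(axis=0)
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(residuals, full_matrices=False)
    return vt[:n_dims].T


def project_to_complement(x, basis):
    """Remove the component of x that lies in the span of `basis`."""
    x = np.asarray(x, dtype=float)
    return x - (x @ basis) @ basis.T


def identify(test_vec, enrolled, basis, eps=1e-12):
    """Return the enrolled speaker with the highest cosine score
    after projecting both vectors onto the orthogonal complement."""
    t = project_to_complement(test_vec, basis)
    best_spk, best_score = None, -np.inf
    for spk, vec in enrolled.items():
        e = project_to_complement(vec, basis)
        score = (t @ e) / (np.linalg.norm(t) * np.linalg.norm(e) + eps)
        if score > best_score:
            best_spk, best_score = spk, score
    return best_spk


# Toy data: axis 0 carries session drift, axis 1 carries speaker identity.
dev = [[0, 1, 0], [2, 1, 0], [4, 1, 0], [0, -1, 0], [2, -1, 0], [4, -1, 0]]
dev_ids = ["A", "A", "A", "B", "B", "B"]
basis = estimate_session_subspace(dev, dev_ids, n_dims=1)

enrolled = {"A": np.array([0.0, 1.0, 0.0]), "B": np.array([10.0, -1.0, 0.0])}
test_vec = np.array([10.0, 1.0, 0.0])  # speaker A recorded with a large drift
print(identify(test_vec, enrolled, basis))
```

In this toy setup the drift along axis 0 would dominate a cosine score computed on the raw vectors, while the projection discards that axis before scoring. The number of subspace dimensions (`n_dims`) is a free parameter in this sketch.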
Acknowledgments
This work was supported by JSPS KAKENHI Grant Number JP16K00229.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Tsuge, S., Kuroiwa, S. (2019). Speaker Recognition in Orthogonal Complement of Time Session Variability Subspace. In: De Pietro, G., Gallo, L., Howlett, R., Jain, L., Vlacic, L. (eds) Intelligent Interactive Multimedia Systems and Services. KES-IIMSS-18 2018. Smart Innovation, Systems and Technologies, vol 98. Springer, Cham. https://doi.org/10.1007/978-3-319-92231-7_11
DOI: https://doi.org/10.1007/978-3-319-92231-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92230-0
Online ISBN: 978-3-319-92231-7
eBook Packages: Intelligent Technologies and Robotics (R0)