Abstract
This paper introduces a new model-based clustering algorithm for optimal speaker modeling in speaker identification systems. The algorithm estimates the optimal number of mixture components using a cross-validation methodology, and it overcomes the initialization sensitivity and local maxima problems of the classical Expectation-Maximization (EM) algorithm using a split-and-merge incremental learning approach. Experiments on a speaker identification task demonstrate the efficiency and effectiveness of the proposed algorithm compared with the commonly used EM algorithm.
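To illustrate the idea of choosing the number of mixture components by cross-validation, the following is a minimal sketch and not the A-SMILE algorithm itself. It assumes one-dimensional synthetic data, plain EM with a simple sorted-chunk initialization, and a held-out log-likelihood score; the split-and-merge steps are omitted, and all function names, the variance floor, and the fold count are hypothetical choices made for the example.

```python
import math
import random

VAR_FLOOR = 1e-3  # assumed floor that keeps variances away from zero


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def point_loglik(x, model):
    """Log-likelihood of one point under a list of (weight, mean, var)."""
    return logsumexp([math.log(w) + log_gauss(x, m, v) for w, m, v in model])


def init_from_sorted_chunks(data, k):
    """Deterministic init: one component per contiguous chunk of sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    model = []
    for j in range(k):
        chunk = ordered[j * n // k:(j + 1) * n // k]
        mean = sum(chunk) / len(chunk)
        var = sum((x - mean) ** 2 for x in chunk) / len(chunk)
        model.append((1.0 / k, mean, max(var, VAR_FLOOR)))
    return model


def fit_gmm(data, k, iters=30):
    """Plain EM for a 1-D Gaussian mixture with k components."""
    model = init_from_sorted_chunks(data, k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in model]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weight, mean, and variance of each component.
        new_model = []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
            new_model.append((nj / len(data), mean, max(var, VAR_FLOOR)))
        model = new_model
    return model


def cv_loglik(data, k, folds=5):
    """Average held-out log-likelihood per point over the folds."""
    total = 0.0
    for f in range(folds):
        test = data[f::folds]
        train = [x for i, x in enumerate(data) if i % folds != f]
        model = fit_gmm(train, k)
        total += sum(point_loglik(x, model) for x in test)
    return total / len(data)


def select_k(data, k_max=4):
    """Return the k with the highest cross-validated log-likelihood."""
    scores = {k: cv_loglik(data, k) for k in range(1, k_max + 1)}
    return max(scores, key=scores.get), scores


# Synthetic stand-in for speaker features: two well-separated clusters.
rng = random.Random(0)
data = [rng.gauss(-5.0, 1.0) for _ in range(150)]
data += [rng.gauss(5.0, 1.0) for _ in range(150)]
rng.shuffle(data)

best_k, scores = select_k(data)
print(best_k, scores)
```

Scoring on held-out folds penalizes models that overfit the training folds, which is why the held-out likelihood can be used to compare different component counts. Real speaker features such as MFCC vectors are multidimensional, so a practical version would need multivariate Gaussians rather than this 1-D simplification.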
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Bouziane, A., Kharroubi, J., Zarghili, A. (2017). An Improved Speaker Identification System Using Automatic Split-Merge Incremental Learning (A-SMILE) of Gaussian Mixture Models. In: Silhavy, R., Senkerik, R., Kominkova Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds) Artificial Intelligence Trends in Intelligent Systems. CSOC 2017. Advances in Intelligent Systems and Computing, vol 573. Springer, Cham. https://doi.org/10.1007/978-3-319-57261-1_36
DOI: https://doi.org/10.1007/978-3-319-57261-1_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57260-4
Online ISBN: 978-3-319-57261-1
eBook Packages: Engineering, Engineering (R0)