An Improved Speaker Identification System Using Automatic Split-Merge Incremental Learning (A-SMILE) of Gaussian Mixture Models

Bouziane, Ayoub; Kharroubi, Jamal; Zarghili, Arsalane

doi:10.1007/978-3-319-57261-1_36

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 573)

Included in the following conference series:

Computer Science On-line Conference

Abstract

In this paper, a new model-based clustering algorithm is introduced for optimal speaker modeling in speaker identification systems. The introduced algorithm estimates the optimal number of mixture components using a cross-validation methodology, and it overcomes the initialization sensitivity and local-maxima problems of the classical EM algorithm using a split-and-merge incremental learning approach. Experiments on a speaker identification task demonstrate the efficiency and effectiveness of the proposed algorithm compared to the commonly used Expectation-Maximization (EM) algorithm.
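
The idea of choosing the number of mixture components by cross-validation can be illustrated with a minimal sketch. This is not the authors' A-SMILE algorithm: it uses plain EM on synthetic one-dimensional data, with no split or merge steps, and scores each candidate component count by its average held-out log-likelihood. All function names, the quantile-based initialisation, the fold scheme, and the synthetic data are assumptions made for illustration only.

```python
import math
import random


def gauss(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=50):
    """Fit a k-component 1-D GMM with plain EM (hypothetical helper).

    Returns (weights, means, variances).
    """
    xs = sorted(data)
    n = len(xs)
    # Deterministic initialisation (an assumption): means at evenly spaced quantiles.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    mu = sum(xs) / n
    var0 = sum((x - mu) ** 2 for x in xs) / n
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * gauss(x, m, v) for w, m, v in zip(weights, means, variances)]
            s = sum(p) or 1e-300
            resp.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10  # guards against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-3)  # variance floor avoids collapse
    return weights, means, variances


def avg_loglik(data, model):
    """Average log-likelihood of data under a fitted mixture."""
    weights, means, variances = model
    total = 0.0
    for x in data:
        p = sum(w * gauss(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(p, 1e-300))
    return total / len(data)


def cv_score(data, k, n_folds=5):
    """Mean held-out log-likelihood of a k-component model over the folds."""
    scores = []
    for f in range(n_folds):
        held_out = data[f::n_folds]
        train = [x for i, x in enumerate(data) if i % n_folds != f]
        scores.append(avg_loglik(held_out, fit_gmm_1d(train, k)))
    return sum(scores) / n_folds


def select_num_components(data, candidates=(1, 2, 3, 4)):
    """Return the candidate k with the highest cross-validated score, plus all scores."""
    scores = {k: cv_score(data, k) for k in candidates}
    return max(scores, key=scores.get), scores


# Synthetic data (an assumption): two well-separated clusters.
rng = random.Random(0)
data = [rng.gauss(-5, 1) for _ in range(150)] + [rng.gauss(5, 1) for _ in range(150)]
rng.shuffle(data)
best_k, scores = select_num_components(data)
```

Calling `print(best_k, scores)` afterwards shows the selected component count next to the held-out score of each candidate. In a speaker identification setting the data would be multivariate acoustic feature vectors rather than scalars, which this sketch does not attempt to cover.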

Author information

Authors and Affiliations

Laboratory of Intelligent Systems and Applications, Faculty of Sciences and Technologies, Fez, Morocco
Ayoub Bouziane, Jamal Kharroubi & Arsalane Zarghili

Corresponding author

Correspondence to Ayoub Bouziane.

Editor information

Editors and Affiliations

Faculty of Applied Informatics, Tomas Bata University in Zlín, Zlin, Czech Republic
Radek Silhavy, Roman Senkerik, Zuzana Kominkova Oplatkova, Zdenka Prokopova & Petr Silhavy

Cite this paper

Bouziane, A., Kharroubi, J., Zarghili, A. (2017). An Improved Speaker Identification System Using Automatic Split-Merge Incremental Learning (A-SMILE) of Gaussian Mixture Models. In: Silhavy, R., Senkerik, R., Kominkova Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds) Artificial Intelligence Trends in Intelligent Systems. CSOC 2017. Advances in Intelligent Systems and Computing, vol 573. Springer, Cham. https://doi.org/10.1007/978-3-319-57261-1_36

DOI: https://doi.org/10.1007/978-3-319-57261-1_36
Published: 07 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57260-4
Online ISBN: 978-3-319-57261-1
eBook Packages: Engineering, Engineering (R0)