An Improved Speaker Identification System Using Automatic Split-Merge Incremental Learning (A-SMILE) of Gaussian Mixture Models

  • Conference paper
Artificial Intelligence Trends in Intelligent Systems (CSOC 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 573)

Abstract

In this paper, a new model-based clustering algorithm is introduced for optimal speaker modeling in speaker identification systems. The algorithm estimates the optimal number of mixture components using a cross-validation methodology, and it overcomes the initialization sensitivity and local-maxima problems of the classical EM algorithm through a split-and-merge incremental learning approach. Experiments on a speaker identification task demonstrate the efficiency and effectiveness of the proposed algorithm compared with the commonly used Expectation-Maximization (EM) algorithm.
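
The idea of choosing the number of mixture components by cross-validation can be illustrated with a small sketch. It is not the A-SMILE algorithm: the split and merge steps are omitted, the data are synthetic 1-D samples rather than speech features, and plain EM with a quantile-based initialization stands in for the incremental learner. All function names (`fit_gmm`, `cv_score`, `select_k`) and parameter values are assumptions made for illustration only.

```python
import math
import random


def log_gauss(x, mu, var):
    """Log-density of a 1-D Gaussian N(mu, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def component_logs(x, weights, means, variances):
    """log(w_j) + log N(x | mu_j, var_j) for every component j."""
    return [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of data under the mixture (log-sum-exp)."""
    total = 0.0
    for x in data:
        terms = component_logs(x, weights, means, variances)
        top = max(terms)
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total


def fit_gmm(data, k, iters=50, min_var=1e-3):
    """Plain EM for a 1-D GMM (hypothetical helper, quantile initialization)."""
    n = len(data)
    ordered = sorted(data)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = max(sum((x - centre) ** 2 for x in data) / n, min_var)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in data:
            terms = component_logs(x, weights, means, variances)
            top = max(terms)
            probs = [math.exp(t - top) for t in terms]
            norm = sum(probs)
            resp.append([p / norm for p in probs])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10  # guard against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # variance floor avoids collapse
        total_w = sum(weights)
        weights = [w / total_w for w in weights]
    return weights, means, variances


def cv_score(data, k, folds=5):
    """Mean held-out log-likelihood per sample over interleaved folds."""
    total = 0.0
    for f in range(folds):
        held_out = data[f::folds]
        train = [x for i, x in enumerate(data) if i % folds != f]
        total += log_likelihood(held_out, *fit_gmm(train, k))
    return total / len(data)


def select_k(data, candidates, folds=5):
    """Return the candidate k with the highest cross-validated score."""
    scores = {k: cv_score(data, k, folds) for k in candidates}
    return max(scores, key=scores.get), scores


# Synthetic example: two well-separated clusters (assumed toy data).
rng = random.Random(0)
data = [rng.gauss(-5.0, 1.0) for _ in range(200)]
data += [rng.gauss(5.0, 1.0) for _ in range(200)]
rng.shuffle(data)
best_k, scores = select_k(data, candidates=[1, 2, 3, 4])
print(best_k, scores)
```

Scoring each candidate on held-out data rather than on the training data is what keeps the criterion from always favouring the largest model, since additional components can only raise the training likelihood.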

Author information

Corresponding author

Correspondence to Ayoub Bouziane.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Bouziane, A., Kharroubi, J., Zarghili, A. (2017). An Improved Speaker Identification System Using Automatic Split-Merge Incremental Learning (A-SMILE) of Gaussian Mixture Models. In: Silhavy, R., Senkerik, R., Kominkova Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds) Artificial Intelligence Trends in Intelligent Systems. CSOC 2017. Advances in Intelligent Systems and Computing, vol 573. Springer, Cham. https://doi.org/10.1007/978-3-319-57261-1_36

  • DOI: https://doi.org/10.1007/978-3-319-57261-1_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57260-4

  • Online ISBN: 978-3-319-57261-1

  • eBook Packages: Engineering, Engineering (R0)
