An extreme learning machine approach for speaker recognition

  • Extreme Learning Machine’s Theory & Application
Neural Computing and Applications

Abstract

Over the last two decades, automatic speaker recognition has been an interesting and challenging problem for speech researchers. It comprises two tasks: speaker identification and speaker verification. In this paper, a new classifier, the extreme learning machine (ELM), is examined on the text-independent speaker verification task and compared with the SVM classifier. ELM classifiers have been proposed for generalized single-hidden-layer feedforward networks with a wide variety of hidden nodes. They learn extremely fast and perform well on many artificial and real regression and classification applications. The classifiers are evaluated on the ELSDSR corpus, with Mel-frequency cepstral coefficients (MFCCs) extracted and used as the classifier inputs. The empirical studies show that ELM classifiers and their variants can outperform SVM classifiers on this dataset while requiring less training time.
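
As a rough illustration of why ELM training is fast, the sketch below follows the basic ELM recipe from the ELM literature: hidden-layer weights are drawn at random and left untrained, and the output weights are solved in a single step with a Moore-Penrose pseudoinverse. It is not the authors' code. The synthetic 12-dimensional features standing in for MFCCs, the sigmoid activation, the 50 hidden nodes, and the names `elm_train` and `elm_predict` are all assumptions made for this example.

```python
import numpy as np


def elm_train(X, T, n_hidden, rng):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)                # random hidden biases, never trained
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # sigmoid hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                     # output weights via pseudoinverse
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Return real-valued scores; the sign gives the accept/reject decision."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta


rng = np.random.default_rng(0)

# Synthetic stand-in for MFCC vectors (assumption): two well-separated
# Gaussian clusters, labelled +1 (target speaker) and -1 (impostor).
n, d = 200, 12
X = np.vstack([rng.standard_normal((n, d)) + 1.5,
               rng.standard_normal((n, d)) - 1.5])
T = np.concatenate([np.ones(n), -np.ones(n)])

W, b, beta = elm_train(X, T, n_hidden=50, rng=rng)
scores = elm_predict(X, W, b, beta)
train_accuracy = float(np.mean(np.sign(scores) == T))
print(f"training accuracy: {train_accuracy:.3f}")
```

The only fitted quantity is `beta`, obtained from one linear solve, which is what removes the iterative optimisation an SVM or a backpropagation-trained network would need. In a verification setting the raw score would normally be compared against a tuned threshold instead of zero.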

Corresponding author

Correspondence to Yeng Chai Soh.

About this article

Cite this article

Lan, Y., Hu, Z., Soh, Y.C. et al. An extreme learning machine approach for speaker recognition. Neural Comput & Applic 22, 417–425 (2013). https://doi.org/10.1007/s00521-012-0946-x

  • DOI: https://doi.org/10.1007/s00521-012-0946-x
