Abstract
Over the last two decades, automatic speaker recognition has been an interesting and challenging problem for speech researchers. It falls into two categories: speaker identification and speaker verification. In this paper, a new classifier, the extreme learning machine (ELM), is examined on the text-independent speaker verification task and compared with the SVM classifier. ELM classifiers have been proposed for generalized single-hidden-layer feedforward networks with a wide variety of hidden nodes. They are extremely fast in learning and perform well on many artificial and real regression and classification applications. The ELM and SVM classifiers are evaluated on the ELSDSR corpus, with Mel-frequency cepstral coefficients (MFCCs) extracted and used as the input to the classifiers. Empirical results show that ELM classifiers and their variants can outperform SVM classifiers on this dataset while requiring less training time.
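To make the idea concrete, the following is a minimal sketch of a basic ELM classifier, not the authors' implementation. It assumes the standard formulation in which the hidden-layer weights are drawn at random and left untrained, and only the output weights are solved by least squares through a pseudo-inverse. The function names (`train_elm`, `elm_score`), the sigmoid activation, the hidden-layer size, and the synthetic 12-dimensional vectors standing in for MFCC features are all assumptions made for illustration.

```python
import numpy as np


def train_elm(X, y, n_hidden=50, seed=0):
    """Fit a basic ELM for binary labels y in {-1, +1} (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn at random and never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    # Hidden-layer output matrix with a sigmoid activation (an assumed choice).
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Output weights: least-squares solution via the Moore-Penrose pseudo-inverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta


def elm_score(X, model):
    """Return real-valued scores; the sign gives the accept/reject decision."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta


# Synthetic stand-in for MFCC vectors: 12-dimensional frames for a
# hypothetical target speaker (+1) and impostors (-1).
rng = np.random.default_rng(1)
target = rng.normal(loc=1.0, size=(200, 12))
impostor = rng.normal(loc=-1.0, size=(200, 12))
X = np.vstack([target, impostor])
y = np.concatenate([np.ones(200), -np.ones(200)])

model = train_elm(X, y)
scores = elm_score(X, model)
accuracy = np.mean(np.sign(scores) == y)
print(f"training accuracy: {accuracy:.3f}")
```

Because training reduces to one random projection and one linear solve, there is no iterative weight tuning, which is consistent with the short training times reported above. In a verification setting the real-valued score would be compared against a threshold chosen on held-out data rather than fixed at zero.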
Cite this article
Lan, Y., Hu, Z., Soh, Y.C. et al. An extreme learning machine approach for speaker recognition. Neural Comput & Applic 22, 417–425 (2013). https://doi.org/10.1007/s00521-012-0946-x
DOI: https://doi.org/10.1007/s00521-012-0946-x