Abstract
In this paper, we propose a new text-independent speaker identification method using vector quantization (VQ) and a multi-layer perceptron (MLP). It consists of three parts: a new feature extraction based on spectral peak analysis, speaker clustering and model selection using VQ, and MLP-based speaker identification. The feature vector reflects speaker-specific characteristics and is a long-term feature, which makes it text-independent. The proposed method is computationally efficient in both feature extraction and identification. To evaluate the proposed method, we calculated the correct identification ratio (CIR). The average CIR of the proposed method and of the Gaussian mixture model (GMM) method was 92.27% and 85.78%, respectively, for 5-second segments in 15-speaker identification. These experimental results show that the proposed method achieves performance comparable to the GMM method.
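The abstract does not say how VQ-based model selection or the CIR is computed, so the following is only a minimal sketch under stated assumptions. It assumes that model selection picks the codebook with the lowest average quantization distortion over a segment, and that the CIR is the percentage of test segments assigned to the correct speaker. The function names, the toy codebooks, and the two-dimensional feature vectors are hypothetical. The spectral peak feature extraction and the MLP stage are not modeled here.

```python
from math import dist  # Euclidean distance, Python 3.8+


def avg_distortion(frames, codebook):
    """Mean distance from each feature vector to its nearest codeword."""
    return sum(min(dist(f, c) for c in codebook) for f in frames) / len(frames)


def select_model(frames, codebooks):
    """Return the label of the codebook with the lowest average distortion.

    Assumption: this stands in for the VQ-based model selection step;
    the paper's actual selection rule is not given in the abstract.
    """
    return min(codebooks, key=lambda label: avg_distortion(frames, codebooks[label]))


def correct_identification_ratio(predicted, actual):
    """CIR in percent, assumed to be correct segments / total segments."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical toy codebooks: two codewords per speaker model.
codebooks = {
    "A": [(0.0, 0.0), (1.0, 0.0)],
    "B": [(5.0, 5.0), (6.0, 5.0)],
}

# Each test segment is a list of feature vectors plus its true label.
segments = [
    ([(0.1, 0.0), (0.9, 0.1)], "A"),
    ([(5.2, 4.9), (5.8, 5.1)], "B"),
    ([(0.2, 0.1), (1.1, 0.0)], "B"),  # deliberately mislabeled to show a miss
    ([(0.0, 0.2), (0.8, 0.0)], "A"),
]

predicted = [select_model(frames, codebooks) for frames, _ in segments]
actual = [label for _, label in segments]
cir = correct_identification_ratio(predicted, actual)
```

In a full system along the lines the abstract describes, the selected model would presumably narrow the candidates passed to an MLP classifier rather than give the final decision; that coupling is an assumption and is left out of the sketch.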
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Keum, JS., Park, CH., Lee, HS. (2006). A New Text-Independent Speaker Identification Using Vector Quantization and Multi-layer Perceptron. In: Wang, J., Yi, Z., Zurada, J.M., Lu, BL., Yin, H. (eds) Advances in Neural Networks - ISNN 2006. ISNN 2006. Lecture Notes in Computer Science, vol 3972. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11760023_25
DOI: https://doi.org/10.1007/11760023_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34437-7
Online ISBN: 978-3-540-34438-4
eBook Packages: Computer Science (R0)