Abstract
This paper compares how well Mel Frequency Cepstrum Coefficients (MFCC) and Line Spectrum Pair (LSP) frequencies discriminate the individual features of speakers. We use Kohonen's Self-Organizing Map (SOM) to explore the effectiveness of these two parameters. Because the SOM preserves the topological properties of the feature space, it lets us inspect the differences between them visually. In the experiment, MFCC is derived from the FFT and LSP is derived from LPC analysis. To reduce computational complexity and improve robustness, the LSP parameters are vector quantized with a codebook, as in speech coding, and a distance weighting is incorporated. The SOM is trained with 33 speakers and a codebook of 400 codes. For each speaker, the training utterance is 60 seconds long. The final result shows that the two speech parameters produce very similar feature maps for the same speaker in the general feature space. A correlation criterion provides further verification. Thus, LSP and MFCC coefficients may be considered equivalent in the Euclidean-distance sense. At the end of the paper, a neural-network VQ model is used to compare the experimental validity of the two parameters in text-independent speaker identification, and both achieve satisfactory results.
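The map-and-correlate procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The grid size, learning-rate and neighbourhood schedules, the use of a normalised hit histogram as the "feature map", and Pearson correlation as the "correlation criterion" are all assumptions made for illustration, and the function names `train_som`, `feature_map` and `map_correlation` are hypothetical. Real MFCC or LSP vectors would replace the input arrays.

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=5, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rectangular Kohonen SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # Initialise node weights from randomly chosen training vectors.
    weights = data[rng.integers(0, n, rows * cols)].astype(float)
    # Grid coordinates of every node, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total, step = epochs * n, 0
    for _ in range(epochs):
        for idx in rng.permutation(n):
            frac = step / total
            lr = lr0 * (1.0 - frac)                # assumed linear decay
            sigma = sigma0 * (1.0 - frac) + 0.5    # assumed shrinking radius
            x = data[idx]
            # Best-matching unit: nearest node in Euclidean distance.
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            # Gaussian neighbourhood measured on the 2-D grid around the BMU.
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def feature_map(weights, vectors):
    """Normalised histogram of best-matching units for one set of vectors."""
    d2 = ((vectors[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    hits = np.bincount(d2.argmin(axis=1), minlength=len(weights))
    return hits / hits.sum()

def map_correlation(map_a, map_b):
    """Pearson correlation between two flattened feature maps."""
    return float(np.corrcoef(map_a, map_b)[0, 1])
```

In use, one would train the map on pooled vectors from all speakers, compute one feature map per speaker and per parameter set, and compare the maps with `map_correlation`. A high correlation between two maps of the same speaker would indicate that the two inputs occupy the map in a similar way.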
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yue, P., Qixiu, H., Wenhu, W. (1997). Parameter discrimination analysis in speaker identification using self organizing map. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0016005
DOI: https://doi.org/10.1007/BFb0016005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive