Parameter discrimination analysis in speaker identification using self organizing map

Yue, Pan; Qixiu, Hu; Wenhu, Wu

doi:10.1007/BFb0016005

Pan Yue¹, Hu Qixiu¹ & Wu Wenhu¹

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1206)

Included in the following conference series:

International Conference on Audio- and Video-Based Biometric Person Authentication

Abstract

This paper compares how well Mel Frequency Cepstrum Coefficients (MFCC) and Line Spectrum Pair (LSP) frequencies discriminate between the individual features of speakers. We use Kohonen's Self-Organizing Map (SOM) to explore the effectiveness of these two parameters. Because the SOM preserves the topological properties of the feature space, it lets us perceive the difference between them directly and intuitively. In the experiment, MFCC is derived from the FFT and LSP is derived from LPC analysis. To reduce computational complexity and improve robustness, the LSP parameters are vector-quantized with a codebook, as in speech coding, and a distance weighting is incorporated. The SOM is trained on 33 speakers with a codebook of 400 codewords. For each speaker, the training utterance is 60 s long. The final result shows that the two speech parameters produce very similar feature maps for the same speaker in the general feature space. A correlation criterion provides further verification. Thus, LSP and MFCC coefficients may be considered equivalent in the Euclidean-distance sense. At the end of the paper, a neural-network VQ model is used to compare the experimental validity of the two parameters in text-independent speaker identification, and both achieve satisfactory results.
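
The general idea of training a SOM, reading off a per-speaker feature map, and comparing two maps with a correlation measure can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 8x8 grid, the learning-rate and neighbourhood schedules, the synthetic 12-dimensional Gaussian "speakers", the use of a normalised hit histogram as the feature map, Pearson correlation as the criterion, and the helper names `train_som`, `hit_map`, and `map_correlation`. The authors' actual features, map size, codebook, and correlation criterion may differ.

```python
import numpy as np

def train_som(data, rows=8, cols=8, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rectangular Kohonen SOM with a Gaussian neighbourhood."""
    rng = np.random.default_rng(seed)
    # Initialise node weights from randomly chosen training vectors.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    # Grid coordinates of every node, used for neighbourhood distances.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            # Best-matching unit: the node closest to x in Euclidean distance.
            bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighbourhood centred on the BMU, measured on the grid.
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def hit_map(weights, frames):
    """Normalised histogram of best-matching units for a set of frames."""
    d2 = ((frames[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(d2.argmin(axis=1), minlength=len(weights)).astype(float)
    return counts / counts.sum()

def map_correlation(map_a, map_b):
    """Pearson correlation between two flattened feature maps."""
    return float(np.corrcoef(map_a, map_b)[0, 1])

# Synthetic stand-ins for two speakers' feature frames (not real speech data).
rng = np.random.default_rng(1)
spk_a = rng.normal(loc=0.0, scale=1.0, size=(400, 12))
spk_b = rng.normal(loc=3.0, scale=1.0, size=(400, 12))

# Train on the first half of each speaker, evaluate on held-out frames.
som = train_som(np.vstack([spk_a[:200], spk_b[:200]]))
same = map_correlation(hit_map(som, spk_a[:200]), hit_map(som, spk_a[200:]))
diff = map_correlation(hit_map(som, spk_a[200:]), hit_map(som, spk_b[200:]))
print(f"same speaker: {same:.2f}, different speakers: {diff:.2f}")
```

In this toy setting the two maps of the same synthetic speaker should correlate more strongly than the maps of different speakers. The sketch compares maps on a single SOM only; it does not reproduce the paper's comparison of MFCC-based and LSP-based maps.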

References

R. Brunelli and D. Falavigna, “Person Identification Using Multiple Cues”, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, no. 10, pp. 955–966, 1995.
T. Kohonen, “The Self Organizing Map”, Proc. IEEE, vol. 78, pp. 1464–1480, September 1990.
X.J. Yang and H.S. Chi, “Digital Processing of Speech Signal”, Publishing House of Electronics Industry, 1995.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, P.R. China
Pan Yue, Hu Qixiu & Wu Wenhu

Editor information

Josef Bigün, Gérard Chollet, Gunilla Borgefors

About this paper

Cite this paper

Yue, P., Qixiu, H., Wenhu, W. (1997). Parameter discrimination analysis in speaker identification using self organizing map. In: Bigün, J., Chollet, G., Borgefors, G. (eds) Audio- and Video-based Biometric Person Authentication. AVBPA 1997. Lecture Notes in Computer Science, vol 1206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0016005

DOI: https://doi.org/10.1007/BFb0016005
Published: 10 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-62660-2
Online ISBN: 978-3-540-68425-1
eBook Packages: Springer Book Archive
