Abstract
Log area ratio (LAR) coefficients, derived from linear prediction coefficients (LPC), are well-known features used in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system. Here, instead of the mel frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker identification system. An F-ratio feature analysis conducted on both the LAR and MFCC feature vectors showed that the lower-order LAR coefficients are superior to their MFCC counterparts. The text-independent, closed-set speaker identification rate, as tested on the down-sampled version of the TIMIT database, improved from 96.73% using the MFCC feature to 98.81% using the LAR feature.
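As a rough illustration of the two quantities named in the abstract, the sketch below derives LAR values from a speech frame and computes a per-dimension F-ratio. It is not the authors' implementation: the function names, the autocorrelation-method Levinson-Durbin recursion, the sign convention LAR_i = log((1 + k_i) / (1 - k_i)), and the F-ratio definition (variance of speaker means divided by the mean within-speaker variance) are assumptions chosen for the example, and the frame is assumed to be non-silent.

```python
import numpy as np


def reflection_coefficients(frame, order):
    """Reflection coefficients k_1..k_order via Levinson-Durbin.

    Assumes the autocorrelation method and a non-silent frame (r[0] > 0).
    Prediction polynomial convention: A(z) = 1 + sum_j a_j z^-j.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation for lags 0..order.
    r = np.array([np.dot(frame[:n - lag], frame[lag:]) for lag in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    k = np.zeros(order)
    for i in range(1, order + 1):
        # acc = r[i] + sum_{j=1}^{i-1} a_j * r[i-j]
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k_i = -acc / err
        k[i - 1] = k_i
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k_i * a_prev[i - j]
        a[i] = k_i
        err *= 1.0 - k_i ** 2
    return k


def log_area_ratios(k):
    """LAR under the assumed sign convention: log((1 + k) / (1 - k))."""
    k = np.asarray(k, dtype=float)
    return np.log((1.0 + k) / (1.0 - k))


def f_ratio(features_by_speaker):
    """Per-dimension F-ratio (assumed definition).

    features_by_speaker: list of arrays, each (num_frames, num_dims).
    Returns variance of speaker means / mean within-speaker variance.
    """
    means = np.array([f.mean(axis=0) for f in features_by_speaker])
    within = np.array([f.var(axis=0) for f in features_by_speaker]).mean(axis=0)
    return means.var(axis=0) / within
```

A higher F-ratio for a given coefficient suggests that it separates speakers better relative to its spread within each speaker, which is the sense in which the abstract compares lower-order LAR and MFCC coefficients.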
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chow, D., Abdulla, W.H. (2004). Speaker Identification Based on Log Area Ratio and Gaussian Mixture Models in Narrow-Band Speech. In: Zhang, C., Guesgen, H.W., Yeap, W.K. (eds) PRICAI 2004: Trends in Artificial Intelligence. PRICAI 2004. Lecture Notes in Computer Science, vol 3157. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28633-2_95
DOI: https://doi.org/10.1007/978-3-540-28633-2_95
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22817-2
Online ISBN: 978-3-540-28633-2
eBook Packages: Springer Book Archive