Abstract
This paper compares feature sets extracted using a frequency-time analysis approach with those extracted using a time-frequency analysis approach for text-independent speaker identification. The impetus for the frequency-time analysis approach comes from the band-pass filtering view of the short-time Fourier transform (STFT). Both a Nyquist filter bank and a Gaussian filter bank were used to extract features with the frequency-time analysis approach. Experimental evaluation was conducted on the POLYCOST database with 130 speakers using a Gaussian mixture speaker model. The results reveal that the feature sets extracted using the frequency-time analysis approach perform significantly better than the feature set extracted using the time-frequency analysis approach.
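The band-pass filtering view of the STFT mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, its parameters, the framing scheme, and the relation used to set the Gaussian width are all assumptions made for illustration, and the Nyquist filter bank from the paper is not reproduced. Each band is formed by modulating a Gaussian low-pass prototype to a centre frequency, filtering the whole signal, and then averaging the squared envelope over short frames.

```python
import numpy as np

def gaussian_filterbank_energies(x, fs, centers_hz, bandwidth_hz, frame_len, hop):
    """Log sub-band energies from a bank of Gaussian band-pass filters.

    Hypothetical helper: bandwidth_hz is taken as the standard deviation
    of each filter's Gaussian frequency response (an assumed convention).
    """
    # Time-domain Gaussian whose Fourier transform has std. dev. bandwidth_hz
    sigma_t = 1.0 / (2.0 * np.pi * bandwidth_hz)
    half = int(np.ceil(4 * sigma_t * fs))        # truncate at +/- 4 sigma
    t = np.arange(-half, half + 1) / fs
    proto = np.exp(-0.5 * (t / sigma_t) ** 2)
    proto /= proto.sum()                         # unit gain at the band centre

    n_frames = 1 + (len(x) - frame_len) // hop
    feats = []
    for fc in centers_hz:
        # Shift the low-pass prototype to fc: a complex band-pass filter
        h = proto * np.exp(2j * np.pi * fc * t)
        y = np.convolve(x, h, mode="same")       # filter first (frequency) ...
        env2 = np.abs(y) ** 2
        # ... then frame the band output (time)
        e = np.array([env2[i * hop: i * hop + frame_len].mean()
                      for i in range(n_frames)])
        feats.append(np.log(e + 1e-12))          # small floor avoids log(0)
    return np.stack(feats, axis=1)               # shape: (n_frames, n_bands)
```

For a one-second signal sampled at 8 kHz with 200-sample frames and an 80-sample hop, the function returns one row per frame and one column per band. Such a matrix could then be passed to a Gaussian mixture model per speaker, although the features actually used in the paper may differ from these log energies.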
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sen, N., Basu, T.K. (2011). Features Extracted Using Frequency-Time Analysis Approach from Nyquist Filter Bank and Gaussian Filter Bank for Text-Independent Speaker Identification. In: Vielhauer, C., Dittmann, J., Drygajlo, A., Juul, N.C., Fairhurst, M.C. (eds) Biometrics and ID Management. BioID 2011. Lecture Notes in Computer Science, vol 6583. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19530-3_12
DOI: https://doi.org/10.1007/978-3-642-19530-3_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19529-7
Online ISBN: 978-3-642-19530-3
eBook Packages: Computer Science (R0)