Text-dependent Speaker Recognition using Wavelets and Neural Networks

Lim, Chee Peng; Woo, Siew Chan

doi:10.1007/s00500-006-0099-x

Original Paper
Published: 31 May 2006

Volume 11, pages 549–556, (2007)
Soft Computing

Abstract

An intelligent system for text-dependent speaker recognition is proposed in this paper. The system consists of a wavelet-based module as the feature extractor of speech signals and a neural-network-based module as the signal classifier. The Daubechies wavelet is employed to filter and compress the speech signals. The fuzzy ARTMAP (FAM) neural network is used to classify the processed signals. A series of experiments on text-dependent gender and speaker recognition are conducted to assess the effectiveness of the proposed system using a collection of vowel signals from 100 speakers. A variety of operating strategies for improving the FAM performance are examined and compared. The experimental results are analyzed and discussed.
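
The wavelet-based feature extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which Daubechies order, how many decomposition levels, or which boundary handling the authors used, so the four-tap D4 filter, the periodic extension, the two-level default, and the function names `dwt_step` and `wavelet_features` are all assumptions made here for illustration only.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)

# Daubechies D4 low-pass (scaling) filter coefficients.
# Assumption: the paper's actual wavelet order is not given in the abstract.
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM,
       (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
# Matching high-pass (wavelet) filter from the quadrature-mirror relation.
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def dwt_step(x):
    """One level of the D4 transform with periodic (wrap-around) extension.

    Returns (approximation, detail), each half the length of x.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("signal length must be even and at least 2")
    approx, detail = [], []
    for i in range(0, n, 2):
        # Four consecutive samples, wrapping around at the end of the signal.
        window = [x[(i + k) % n] for k in range(4)]
        approx.append(sum(h * v for h, v in zip(LOW, window)))
        detail.append(sum(g * v for g, v in zip(HIGH, window)))
    return approx, detail


def wavelet_features(signal, levels=2):
    """Hypothetical feature extractor: keep only the coarse approximation.

    Discarding the detail coefficients at each level both smooths the
    signal and shortens it by a factor of 2 per level.
    """
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = dwt_step(approx)
    return approx
```

Under these assumptions, a 16-sample frame passed through `wavelet_features(frame, levels=2)` yields a 4-element vector, which is the kind of compressed representation that could then be handed to a classifier such as fuzzy ARTMAP.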

Author information

Authors and Affiliations

School of Electrical and Electronic Engineering, University of Science Malaysia, Engineering Campus, 14300, Nibong Tebal, Penang, Malaysia
Chee Peng Lim & Siew Chan Woo

Corresponding author

Correspondence to Chee Peng Lim.

About this article

Cite this article

Lim, C.P., Woo, S.C. Text-dependent Speaker Recognition using Wavelets and Neural Networks. Soft Comput 11, 549–556 (2007). https://doi.org/10.1007/s00500-006-0099-x

Published: 31 May 2006
Issue Date: April 2007
DOI: https://doi.org/10.1007/s00500-006-0099-x
