
Text-dependent Speaker Recognition using Wavelets and Neural Networks

  • Original Paper
  • Journal: Soft Computing

Abstract

An intelligent system for text-dependent speaker recognition is proposed in this paper. The system consists of a wavelet-based module as the feature extractor of speech signals and a neural-network-based module as the signal classifier. The Daubechies wavelet is employed to filter and compress the speech signals. The fuzzy ARTMAP (FAM) neural network is used to classify the processed signals. A series of experiments on text-dependent gender and speaker recognition are conducted to assess the effectiveness of the proposed system using a collection of vowel signals from 100 speakers. A variety of operating strategies for improving the FAM performance are examined and compared. The experimental results are analyzed and discussed.
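The wavelet stage described in the abstract can be illustrated with a minimal sketch. The abstract does not state which Daubechies order, decomposition depth, or boundary handling the authors used, so the four-tap D4 filter, the periodic extension, the two-level depth, and the function names `dwt_step` and `wavelet_features` below are all assumptions made for illustration. The fuzzy ARTMAP classifier is not shown.

```python
import math

# Daubechies D4 low-pass filter coefficients (assumed order; the abstract
# only says "Daubechies wavelet").
_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
LOW = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
# High-pass filter from the quadrature-mirror relation g[k] = (-1)^k * h[3 - k].
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]


def dwt_step(signal):
    """One level of the periodic D4 transform: returns (approximation, detail)."""
    n = len(signal)
    if n < 4 or n % 2:
        raise ValueError("signal length must be even and at least 4")
    approx, detail = [], []
    for i in range(0, n, 2):
        # Wrap around at the end of the signal (periodic extension, an assumption).
        window = [signal[(i + k) % n] for k in range(4)]
        approx.append(sum(h * x for h, x in zip(LOW, window)))
        detail.append(sum(g * x for g, x in zip(HIGH, window)))
    return approx, detail


def wavelet_features(signal, levels):
    """Filter and compress: keep only the approximation after `levels` steps."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = dwt_step(approx)  # detail coefficients are discarded
    return approx


# Toy input: a 64-sample sinusoid standing in for one frame of a vowel recording.
frame = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
features = wavelet_features(frame, levels=2)  # 64 samples -> 16 coefficients
```

Each level halves the number of coefficients, which is one plausible reading of "filter and compress": the low-pass branch smooths the signal and the downsampling shortens the feature vector handed to the classifier.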



Author information


Corresponding author

Correspondence to Chee Peng Lim.


About this article

Cite this article

Lim, C.P., Woo, S.C. Text-dependent Speaker Recognition using Wavelets and Neural Networks. Soft Comput 11, 549–556 (2007). https://doi.org/10.1007/s00500-006-0099-x

  • DOI: https://doi.org/10.1007/s00500-006-0099-x
