Abstract
This paper presents a new speaker recognition system based on a model of the human auditory system. The model combines a human nonlinear cochlear filter bank with Neural Nets.
The performance of the system was tested on a number of Spanish words from the ‘Ahumada’ database, uttered by a native male speaker. The words were fed into the cochlea model, and the resulting outputs were processed with an envelope component extractor, yielding five parameters that convey different auditory sensations (loudness, roughness and virtual tones).
Because this process generates large data sets, multivariate statistical methods and Neural Nets were well suited to the analysis. A variety of normalization techniques and classification methods were tested on this biologically motivated feature set.
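The abstract does not say which normalization techniques were tested. As a minimal sketch only, the following shows two common per-parameter schemes (z-score and min-max) applied to a feature matrix with five columns, one per auditory parameter. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def zscore_normalize(rows):
    """Scale each column to zero mean and unit population standard deviation."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def minmax_normalize(rows):
    """Scale each column linearly so its minimum maps to 0 and its maximum to 1."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A constant column has zero range; fall back to 1.0 to avoid dividing by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(x - lo) / sp for x, lo, sp in zip(row, lows, spans)] for row in rows]


# Hypothetical data: 4 observations x 5 parameters (values are made up).
features = [
    [1.0, 10.0, 0.2, 5.0, 3.0],
    [2.0, 20.0, 0.4, 5.0, 1.0],
    [3.0, 30.0, 0.6, 5.0, 4.0],
    [4.0, 40.0, 0.8, 5.0, 2.0],
]

z = zscore_normalize(features)
mm = minmax_normalize(features)
```

Normalizing per column keeps parameters with large numeric ranges from dominating a distance-based or neural classifier; which scheme works best for this feature set is an empirical question the paper itself addresses.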
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martínez-Rams, E.A., Garcerán-Hernández, V. (2009). Assessment of a Speaker Recognition System Based on an Auditory Model and Neural Nets. In: Mira, J., Ferrández, J.M., Álvarez, J.R., de la Paz, F., Toledo, F.J. (eds) Bioinspired Applications in Artificial and Natural Computation. IWINAC 2009. Lecture Notes in Computer Science, vol 5602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02267-8_52
DOI: https://doi.org/10.1007/978-3-642-02267-8_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02266-1
Online ISBN: 978-3-642-02267-8
eBook Packages: Computer Science (R0)