Assessment of a Speaker Recognition System Based on an Auditory Model and Neural Nets

Martínez–Rams, Ernesto A.; Garcerán–Hernández, Vicente

doi:10.1007/978-3-642-02267-8_52

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5602)

Included in the following conference series:

International Work-Conference on the Interplay Between Natural and Artificial Computation

Abstract

This paper presents a new speaker recognition system based on a model of the human auditory system. The model combines a human nonlinear cochlear filter bank with neural networks.

The efficiency of the system was tested using a number of Spanish words from the ‘Ahumada’ database, uttered by a native male speaker. These words were fed into the cochlea model, and the corresponding outputs were processed with an envelope component extractor, yielding five parameters that convey different auditory sensations (loudness, roughness and virtual tones).
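As a rough illustration of what an envelope extractor does to one channel of a filter-bank output, the sketch below half-wave rectifies a signal and smooths it with a one-pole low-pass filter. This is a generic textbook technique, not the extractor used in the paper; the function name `envelope`, the 50 Hz cutoff and the synthetic test tone are all assumptions made for the example.

```python
import math


def envelope(signal, sample_rate, cutoff_hz=50.0):
    """Half-wave rectify a signal, then smooth it with a one-pole low-pass filter.

    Hypothetical helper: the cutoff value is an illustrative assumption,
    not a parameter taken from the paper.
    """
    # Smoothing coefficient of a one-pole low-pass filter at the given cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    smoothed = []
    for sample in signal:
        rectified = max(sample, 0.0)         # keep only the positive half-waves
        state += alpha * (rectified - state)  # exponential smoothing
        smoothed.append(state)
    return smoothed


# Toy stand-in for one cochlear channel: a 1 kHz tone whose amplitude
# is modulated at 4 Hz (one second at 16 kHz sampling).
sample_rate = 16000
channel = [
    (0.5 + 0.5 * math.sin(2 * math.pi * 4 * n / sample_rate))
    * math.sin(2 * math.pi * 1000 * n / sample_rate)
    for n in range(sample_rate)
]
env = envelope(channel, sample_rate)
```

The smoothed output follows the slow 4 Hz amplitude modulation rather than the 1 kHz carrier, which is the kind of information an envelope-based feature would then summarise.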

Because this process generates large data sets, multivariate statistical methods and neural networks are appropriate tools for analysing them. A variety of normalization techniques and classification methods were tested on this biologically motivated feature set.
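The abstract does not say which normalization techniques or classifiers were compared. As a minimal sketch of the general idea, the example below applies two common per-feature normalizations (z-score and min-max) and then classifies with a nearest-centroid rule. The classifier is a deliberately simple stand-in for the neural networks mentioned above, and the feature values, labels and function names are invented for illustration.

```python
import statistics


def zscore(column):
    """Scale one feature column to zero mean and unit variance."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    return [(v - mean) / std if std > 0 else 0.0 for v in column]


def minmax(column):
    """Scale one feature column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]


def normalize(rows, method):
    """Apply a column-wise normalization to a list of feature rows."""
    columns = zip(*rows)
    scaled = [method(list(col)) for col in columns]
    return [list(row) for row in zip(*scaled)]


def nearest_centroid(train_rows, train_labels, query):
    """Return the label whose mean feature vector is closest to the query."""
    sums, counts = {}, {}
    for row, label in zip(train_rows, train_labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum(
            (total / counts[label] - q) ** 2
            for total, q in zip(sums[label], query)
        )

    return min(sums, key=squared_distance)


# Invented two-feature data on very different scales (hypothetical values).
features = [[1.0, 200.0], [1.2, 220.0], [3.0, 800.0], [3.2, 780.0]]
labels = ["speaker_a", "speaker_a", "speaker_b", "speaker_b"]

scaled = normalize(features, minmax)
predicted = nearest_centroid(scaled, labels, scaled[3])
```

Normalizing first keeps the feature with the larger numeric range from dominating the distance computation, which is one reason the choice of normalization can affect classification results.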

Author information

Authors and Affiliations

Universidad de Oriente, Avenida de la América s/n, Santiago de Cuba, Cuba
Ernesto A. Martínez–Rams
Antiguo Cuartel de Antiguones, Universidad Politécnica de Cartagena, (Campus de la Muralla), Cartagena, 30202, Murcia, España
Vicente Garcerán–Hernández

Editor information

Editors and Affiliations

Departamento de Inteligencia Artificial, Universidad Nacional de Educación a Distancia, E.T.S. de Ingeniería Informática, Juan del Rosal, 16, 28040, Madrid, Spain
José Mira & Félix de la Paz
Departamento de Electrónica, Tecnología de Computadores y Proyectos, Universidad Politécnica de Cartagena, Pl. Hospital, 1, 30201, Cartagena, Spain
José Manuel Ferrández
Departamento de Inteligencia Artificial, Universidad Nacional de Educación a Distancia, E.T.S. de Ingeniería Informática, Juan del Rosal, 16, 28040, Madrid, Spain
José R. Álvarez
Departamento de Electrónica, Tecnología de Computadoras y Proyectos, Universidad Politécnica de Cartagena, Pl. Hospital, 1, 30201, Cartagena
F. Javier Toledo

About this paper

Cite this paper

Martínez–Rams, E.A., Garcerán–Hernández, V. (2009). Assessment of a Speaker Recognition System Based on an Auditory Model and Neural Nets. In: Mira, J., Ferrández, J.M., Álvarez, J.R., de la Paz, F., Toledo, F.J. (eds) Bioinspired Applications in Artificial and Natural Computation. IWINAC 2009. Lecture Notes in Computer Science, vol 5602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02267-8_52

DOI: https://doi.org/10.1007/978-3-642-02267-8_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02266-1
Online ISBN: 978-3-642-02267-8
eBook Packages: Computer Science, Computer Science (R0)
