A biological front-end processing for speech recognition

Ferrández, J. M.; del Valle, D.; Rodellar, V.; Gómez, P.

doi:10.1007/BFb0032565

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1240)

Included in the following conference series:

International Work-Conference on Artificial Neural Networks

Abstract

This paper presents a new method for extracting formants. It has a low computational cost compared with similar models while retaining the robustness inherent to physiological systems. This is achieved with a place-time strategy that uses temporal information for fibers with low characteristic frequency and spatial (place) information for fibers with higher characteristic frequency.
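
As a rough illustration of the place-time idea, the following Python sketch splits a bank of channel outputs by characteristic frequency and applies a different cue to each group. It is not the authors' implementation, which the abstract does not describe. The 1 kHz cutoff, the zero-crossing interval estimator for the temporal cue, the energy-peak rule for the place cue, and all function names are assumptions made for this example.

```python
import numpy as np

def temporal_frequency(x, fs):
    """Estimate the dominant frequency of one channel from the mean
    interval between upward zero crossings (temporal cue)."""
    up = np.flatnonzero((x[:-1] < 0) & (x[1:] >= 0))
    if up.size < 2:
        return None  # not enough crossings to measure an interval
    return fs / np.mean(np.diff(up))

def place_frequency(channels, cfs):
    """Return the characteristic frequency of the most energetic
    channel (spatial, or place, cue)."""
    energy = np.sum(channels ** 2, axis=1)
    return float(cfs[np.argmax(energy)])

def place_time_estimates(channels, cfs, fs, cutoff_hz=1000.0):
    """Hypothetical place-time split: temporal cue below cutoff_hz,
    place cue at or above it. Returns (low_estimate, high_estimate)."""
    cfs = np.asarray(cfs, dtype=float)
    channels = np.asarray(channels, dtype=float)
    low = cfs < cutoff_hz
    low_est = high_est = None
    if low.any():
        # Use the strongest low-CF channel and read its timing.
        strongest = np.argmax(np.sum(channels[low] ** 2, axis=1))
        low_est = temporal_frequency(channels[low][strongest], fs)
    if (~low).any():
        # For high-CF channels, only the position of the peak is used.
        high_est = place_frequency(channels[~low], cfs[~low])
    return low_est, high_est

# Synthetic channel outputs standing in for a cochlear filter bank
# (assumed data, chosen only to exercise the two cues).
fs = 16000
t = np.arange(1600) / fs
cfs = [300.0, 600.0, 1500.0, 2500.0]
channels = np.stack([
    0.1 * np.sin(2 * np.pi * 550 * t + 0.3),
    0.9 * np.sin(2 * np.pi * 550 * t + 0.3),
    0.2 * np.sin(2 * np.pi * 2400 * t),
    0.8 * np.sin(2 * np.pi * 2400 * t),
])
low_est, high_est = place_time_estimates(channels, cfs, fs)
```

With this synthetic input, the low estimate comes from the timing of the 550 Hz component, while the high estimate is simply the characteristic frequency of the strongest high channel, so its resolution is limited by the channel spacing. That trade-off is one plausible reason to reserve the temporal cue for the low-frequency region.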

Author information

Authors and Affiliations

Laboratorio de Comunicación Oral Robert W. Newcomb, Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo-Boadilla del Monte, 28660, Madrid, Spain
J. M. Ferrández, D. del Valle, V. Rodellar & P. Gómez

Editor information

José Mira, Roberto Moreno-Díaz, Joan Cabestany

Cite this paper

Ferrández, J.M., del Valle, D., Rodellar, V., Gómez, P. (1997). A biological front-end processing for speech recognition. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032565

DOI: https://doi.org/10.1007/BFb0032565
Published: 18 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63047-0
Online ISBN: 978-3-540-69074-0
eBook Packages: Springer Book Archive
