Abstract
A comprehensive view of speech and voice technologies now demands better and more complex tools capable of extracting as much knowledge about sound and speech as possible. From a bio-inspired point of view, many knowledge-extraction tasks in speech and voice share well-known procedures at the algorithmic level. The same resources employed to decode speech phones may be used to characterize the speaker (gender, age, speaking group, etc.). Based on these facts, the present paper examines a hierarchy of sound-processing levels along the auditory and perceptual neural pathways of the brain, which can be translated into a bio-inspired audio-processing architecture. Its fundamental characteristics are analyzed in relation to current trends in cognitive audio processing. Examples drawn from speech-processing applications in the domain of acoustic phonetics are presented. These may be applicable to speaker characterization, forensics, and biometry, among others.
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Gómez-Vilda, P., Ferrández-Vicente, J.M., Rodellar-Biarge, V., Álvarez-Marquina, A., Mazaira-Fernández, L.M. (2007). A Bio-inspired Architecture for Cognitive Audio. In: Mira, J., Álvarez, J.R. (eds) Bio-inspired Modeling of Cognitive Tasks. IWINAC 2007. Lecture Notes in Computer Science, vol 4527. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73053-8_14
DOI: https://doi.org/10.1007/978-3-540-73053-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73052-1
Online ISBN: 978-3-540-73053-8
eBook Packages: Computer Science, Computer Science (R0)