Using Values of the Human Cochlea in the Macro and Micro Mechanical Model for Automatic Speech Recognition

  • Conference paper
Nature-Inspired Computation and Machine Learning (MICAI 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8857)

Abstract

Parametric representations based on cochlear behavior have recently been used in several studies on Automatic Speech Recognition (ASR), because this hearing organ is the most important element in mammals for transducing the sound pressure received by the outer ear. This paper shows how the macro and micro mechanical model is used in ASR tasks. The values that Neely, Elliott and Ku found in their works on macro and micro mechanical models, such as Neely's, were used to set the central frequencies of a filter bank, so that parameters are obtained from speech in a way similar to the construction of MFCC (Mel Frequency Cepstral Coefficients).
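As a rough illustration of how mechanical values of the cochlea can be turned into filter-bank central frequencies, the sketch below treats each cochlear section as an undamped mass-spring resonator with f = sqrt(k/m) / (2π). This is an assumption made for illustration only: the function names, the exponential stiffness profile and every numeric value are hypothetical placeholders, not the values of Neely, Elliott or Ku, and damping is ignored.

```python
import math

def resonance_frequency(mass, stiffness):
    """Undamped resonance of a mass-spring section, in Hz."""
    return math.sqrt(stiffness / mass) / (2.0 * math.pi)

def center_frequencies(n_sections, mass, k_base, decay):
    """Resonance per section for a stiffness that decays from base to apex.

    The profile k(x) = k_base * exp(-decay * x) is a placeholder, with x the
    normalized position along the cochlea (0 = base, 1 = apex).
    """
    freqs = []
    for i in range(n_sections):
        x = i / (n_sections - 1)
        stiffness = k_base * math.exp(-decay * x)
        freqs.append(resonance_frequency(mass, stiffness))
    return freqs

# Hypothetical placeholder values, chosen only to land in the speech band.
freqs = center_frequencies(n_sections=20, mass=1.0e-3, k_base=1.0e6, decay=8.0)
```

With these placeholder values the frequencies fall monotonically from the base toward the apex, which is the qualitative behavior a place-frequency map needs before it can replace the Mel scale.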

An approach that distributes the filter bank in a new way within our parametric representation is proposed. This distribution is then applied to obtain a representation of speech in the frequency domain that differs from MFCC. The responses of the macro and micro mechanical model to the three sets of values mentioned above were used to create the central frequencies of the filter bank, so that the Mel scale function is replaced by a representation based on the cochlear response of the Neely model. This model was used with a set of different cochlear parameters taken from the works of Neely, Elliott and Ku, such as mass, damping and stiffness, among others. A performance of 98 to 100% was reached on a task using isolated Spanish digits pronounced by 5 different speakers. The method was also applied to the neutral recordings of the SUSAS corpus, with some advantages in comparison with MFCC.
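The pipeline described above (a filter bank whose central frequencies come from a cochlear map instead of the Mel scale, followed by MFCC-style processing) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: Greenwood's place-frequency function is swapped in as a stand-in for the Neely-model response, and the function names, filter count and sampling rate are all hypothetical.

```python
import math

def greenwood_hz(x):
    """Greenwood place-frequency map (human); x in [0, 1], apex to base.

    Stand-in for the model-derived map; not the Neely-model response.
    """
    return 165.4 * (10.0 ** (2.1 * x) - 0.88)

def triangular_filterbank(centers_hz, n_bins, sample_rate):
    """One triangle per interior center; its neighbors act as the edges."""
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)
    bank = []
    for lo, mid, hi in zip(centers_hz, centers_hz[1:], centers_hz[2:]):
        row = []
        for b in range(n_bins):
            f = b * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def cepstral_coefficients(power_spectrum, bank, n_coeffs):
    """Log filter-bank energies followed by a DCT-II, as in the MFCC recipe."""
    energies = [
        math.log(sum(w * p for w, p in zip(row, power_spectrum)) + 1e-10)
        for row in bank
    ]
    n = len(energies)
    return [
        sum(e * math.cos(math.pi * k * (j + 0.5) / n)
            for j, e in enumerate(energies))
        for k in range(n_coeffs)
    ]

# Hypothetical setup: 8 kHz speech, 129 spectrum bins, 12 filters.
sample_rate, n_bins, n_filters = 8000, 129, 12
# n_filters + 2 points so that every filter has a lower and an upper edge;
# the position range keeps all points below the Nyquist frequency.
positions = [0.1 + 0.55 * i / (n_filters + 1) for i in range(n_filters + 2)]
centers = [greenwood_hz(x) for x in positions]
bank = triangular_filterbank(centers, n_bins, sample_rate)

flat_spectrum = [1.0] * n_bins  # placeholder input instead of real speech
coeffs = cepstral_coefficients(flat_spectrum, bank, n_coeffs=8)
```

Only the list of central frequencies depends on the cochlear map, so swapping in a different model response means replacing `greenwood_hz` and leaving the rest of the chain untouched.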

References

  1. Noll, A.M.: Short-Time Spectrum and Cepstrum Techniques for Vocal-Pitch Detection. Journal of the Acoustical Society of America 36, 296–302 (1964)

  2. Makhoul, J.: Linear Prediction: A Tutorial Review. Proceedings of the IEEE 63(4), 561–580 (1975)

  3. Davis, S.B., Mermelstein, P.: Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences. IEEE Transactions on Acoustics, Speech and Signal Processing ASSP-28(4) (August 1980)

  4. Hermansky, H.: Perceptual Linear Predictive (PLP) analysis of speech. Journal of the Acoustical Society of America, 1738–1752 (April 1990)

  5. Dallos, P., Fay, R.R.: Mechanics of the cochlea: modeling effects. In: de Boer, S.E. (ed.) The Cochlea. Springer, USA (1996)

  6. Robles, L., Ruggero, M.A.: Mechanics of the Mammalian Cochlea. Physiological Reviews 81(3) (July 2001)

  7. Kim, D.S., Lee, S.Y., Kil, R.M.: Auditory processing of speech signals for robust speech recognition in real-world noisy environments. IEEE Trans. Speech Audio Processing 7(1), 55–69 (1999)

  8. Geisler, C.D.: A model of the effect of outer hair cell motility on cochlear vibration. Hear. Res. 24, 125–131 (1996)

  9. Geisler, C.D., Shan, X.: A model for cochlear vibration based on feedback from motile outer hair cells. In: Dallos, P., Geisler, C.D., Matthews, J.W., Ruggero, M.A., Steele, C.R. (eds.) The Mechanics and Biophysics of Hearing, pp. 86–95. Springer, New York (1990)

  10. Holmberg, M., Gelbart, D., Hemmert, W.: Automatic speech recognition with an adaptation model motivated by auditory processing. IEEE Trans. Audio, Speech, Language Processing 14(1), 44–49 (2006)

  11. Haque, S., Togneri, R.: A feature extraction method for automatic speech recognition based on the cochlear nucleus. In: 11th Annual Conference of the International Speech Communication Association, InterSpeech 2010, Makuhari, Chiba, Japan, September 26-30 (2010)

  12. Harczos, T., Szepannek, G., Klefenz, F.: Towards Automatic Speech Recognition based on Cochlear Traveling Wave Delay Trajectories. In: Dau, T., Buchholz, J.M., Harte, J.M., Christiansen, T.U. (eds.) 1st International Symposium on Auditory and Audiological Research (ISAAR 2007), Auditory Signal Processing in Hearing-impaired Listeners (2007) ISBN: 87-990013-1-4. Print: Centertryk A/S

  13. Keener, J., Sneyd, J.: Mathematical Physiology. Springer, USA (2008)

  14. Elliott, S.J., Ku, E.M., Lineton, B.A.: A state space model for cochlear mechanics. Journal of the Acoustical Society of America 122, 2759–2771 (2007)

  15. Elliott, S.J., Lineton, B., Ni, G.: Fluid coupling in a discrete model of cochlear mechanics. Journal of the Acoustical Society of America 130, 1441–1451 (2011)

  16. Ku, E.M., Elliott, S.J., Lineton, B.A.: Statistics of instabilities in a state space model of the human cochlea. Journal of the Acoustical Society of America 124, 1068–1079 (2008)

  17. Neely, S.T.: A model for active elements in cochlear biomechanics. Journal of the Acoustical Society of America 79, 1472–1480 (1986)

  18. Békésy: Concerning the pleasures of observing, and the mechanics of the inner ear. Nobel Lecture (December 11, 1961)

  19. Mario, J.H., Rodríguez, J.L.O., Guerra, S.S., Fernández, R.B.: Computational Model of the Cochlea using Resonance Analysis. Revista Mexicana de Ingeniería Biomédica 33(2), 77–86 (2012)

  20. Mario, J.H.: Modelo mecánico acústico del oído interno en reconocimiento de voz. Ph.D. Thesis, Center for Computing Research-IPN (June 2013)

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Oropeza Rodríguez, J.L., Suárez Guerra, S. (2014). Using Values of the Human Cochlea in the Macro and Micro Mechanical Model for Automatic Speech Recognition. In: Gelbukh, A., Espinoza, F.C., Galicia-Haro, S.N. (eds) Nature-Inspired Computation and Machine Learning. MICAI 2014. Lecture Notes in Computer Science, vol 8857. Springer, Cham. https://doi.org/10.1007/978-3-319-13650-9_22

  • DOI: https://doi.org/10.1007/978-3-319-13650-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13649-3

  • Online ISBN: 978-3-319-13650-9

  • eBook Packages: Computer Science, Computer Science (R0)
