Emulating Temporal Receptive Fields of Higher Level Auditory Neurons for ASR

  • Conference paper
Text, Speech and Dialogue (TSD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5246)

Abstract

This paper proposes modifications to the Multi-resolution RASTA (MRASTA) feature extraction technique for automatic speech recognition (ASR). By emulating asymmetries of the temporal receptive field (TRF) profiles of higher level auditory neurons, we obtain a relative improvement of more than 11.4% in word error rate on the OGI-Digits database. Experiments on the TIMIT database confirm that the proposed modifications are indeed useful.
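
For readers unfamiliar with the metric, a relative improvement in word error rate (WER) is conventionally the WER reduction divided by the baseline WER. The sketch below illustrates that arithmetic only. The function name and the two WER values are hypothetical and chosen for illustration; they are not results from the paper, whose absolute error rates are not given in the abstract.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction, in percent, of new_wer over baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (in percent), NOT taken from the paper.
baseline = 5.0   # assumed baseline system
modified = 4.4   # assumed system with modified features

print(f"{relative_wer_improvement(baseline, modified):.1f}% relative improvement")
```

Under these assumed values, an absolute reduction of 0.6 percentage points corresponds to a 12% relative improvement, which shows why relative figures can look large even when the absolute change is small.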

Editor information

Petr Sojka Aleš Horák Ivan Kopeček Karel Pala

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sivaram, G.S.V.S., Hermansky, H. (2008). Emulating Temporal Receptive Fields of Higher Level Auditory Neurons for ASR. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_65

  • DOI: https://doi.org/10.1007/978-3-540-87391-4_65

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87390-7

  • Online ISBN: 978-3-540-87391-4

  • eBook Packages: Computer Science, Computer Science (R0)
