Model of Auditory Filters and MPEG-7 Descriptors in Sound Recognition

  • Conference paper
Active Media Technology (AMT 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8610)

Abstract

This study examined whether applying a model of the human auditory filter could improve sound recognition based on MPEG-7 standard audio descriptors. Filtering in the auditory system was modeled with a bank of 38 gammatone filters closely spaced across the audible frequency range. The filter bank was implemented as a low-level audio descriptor that replaces the MPEG-7 audio descriptor based on the short-time Fourier transform (STFT). Recognition tests were conducted on a large set of sounds from nine musical instruments and on speech from twelve speakers. The proposed gammatone-based descriptor yielded better recognition of both musical instruments and speakers than the original STFT-based low-level MPEG-7 audio descriptor.
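The filter-bank idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives only the number of filters (38), so everything else here is an assumption, including the 50 Hz to 15 kHz range, the fourth-order gammatone impulse response, the ERB formulas of Glasberg and Moore for bandwidth and spacing, and the use of mean-square energy per band as a spectrum-like feature. All function names are hypothetical.

```python
import numpy as np

def erb_bandwidth(fc_hz):
    # Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula, assumed here).
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)

def hz_to_erb_rate(f_hz):
    # Map frequency in Hz to the ERB-rate scale.
    return 21.4 * np.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(erb):
    # Inverse of hz_to_erb_rate.
    return (10.0 ** (erb / 21.4) - 1.0) * 1000.0 / 4.37

def center_frequencies(n_filters=38, f_low=50.0, f_high=15000.0):
    # Center frequencies equally spaced on the ERB-rate scale.
    # f_low and f_high are illustrative choices, not values from the paper.
    erb_points = np.linspace(hz_to_erb_rate(f_low), hz_to_erb_rate(f_high), n_filters)
    return erb_rate_to_hz(erb_points)

def gammatone_ir(fc_hz, fs, duration=0.05, order=4, b=1.019):
    # Gammatone impulse response: t^(n-1) * exp(-2*pi*b*ERB*t) * cos(2*pi*fc*t),
    # normalized to unit energy.
    t = np.arange(int(duration * fs)) / fs
    envelope = t ** (order - 1) * np.exp(-2.0 * np.pi * b * erb_bandwidth(fc_hz) * t)
    ir = envelope * np.cos(2.0 * np.pi * fc_hz * t)
    return ir / np.sqrt(np.sum(ir ** 2))

def band_energies(signal, fs, n_filters=38):
    # Filter the signal with each gammatone channel and return the
    # mean-square output per channel. f_high must stay below fs / 2.
    fcs = center_frequencies(n_filters)
    energies = np.array([
        np.mean(np.convolve(signal, gammatone_ir(fc, fs), mode="same") ** 2)
        for fc in fcs
    ])
    return fcs, energies
```

For a 1 kHz test tone sampled at 32 kHz, the channel with the largest energy should be the one whose center frequency lies closest to 1 kHz. In a recognition pipeline, such per-band energies would be computed frame by frame and take the place of the STFT-based spectrum.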

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Świercz, A., Żera, J. (2014). Model of Auditory Filters and MPEG-7 Descriptors in Sound Recognition. In: Ślȩzak, D., Schaefer, G., Vuong, S.T., Kim, YS. (eds) Active Media Technology. AMT 2014. Lecture Notes in Computer Science, vol 8610. Springer, Cham. https://doi.org/10.1007/978-3-319-09912-5_17

  • DOI: https://doi.org/10.1007/978-3-319-09912-5_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09911-8

  • Online ISBN: 978-3-319-09912-5

  • eBook Packages: Computer Science, Computer Science (R0)
