Abstract
This study examined whether a model of the human auditory filter could improve sound recognition based on MPEG-7 standard audio descriptors. Filtering in the auditory system was modeled with a bank of 38 gammatone filters closely spaced across the audible frequency range. The filter bank was implemented as a low-level audio descriptor replacing the MPEG-7 audio descriptor based on the short-time Fourier transform (STFT). Sound recognition tests were conducted on a large set of sounds from nine musical instruments and speech from twelve speakers. The proposed gammatone-based descriptor improved recognition of both musical instruments and speakers compared with the original STFT-based low-level MPEG-7 audio descriptor.
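To illustrate what a bank of 38 gammatone filters might look like in practice, the following is a minimal sketch, not the authors' implementation. The abstract does not state the frequency limits, the spacing rule, the sampling rate, or the filter parameters, so the values used here are assumptions: a 50 Hz to 16 kHz range, equal spacing on the Glasberg and Moore ERB-rate scale, a 44.1 kHz sampling rate, and the commonly used fourth-order gammatone impulse response with bandwidth factor 1.019. All function names are hypothetical.

```python
import math

def erb_rate(f_hz):
    """ERB-rate (ERB number) for a frequency in Hz, Glasberg & Moore formula."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_rate_to_hz(e):
    """Inverse of erb_rate: frequency in Hz for a given ERB number."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at frequency f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def center_frequencies(n_filters=38, f_low=50.0, f_high=16000.0):
    """Center frequencies equally spaced on the ERB-rate scale.

    The 50 Hz and 16 kHz limits are assumptions; the abstract only says
    the filters span the audible frequency range.
    """
    e_low, e_high = erb_rate(f_low), erb_rate(f_high)
    step = (e_high - e_low) / (n_filters - 1)
    return [erb_rate_to_hz(e_low + i * step) for i in range(n_filters)]

def gammatone_ir(fc, fs=44100, duration=0.05, order=4, b=1.019):
    """Sampled gammatone impulse response, normalized to unit peak.

    g(t) = t^(order-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t)
    The response is truncated after `duration` seconds; the lowest
    channels decay slowly and may need a longer window.
    """
    n = int(duration * fs)
    decay = 2.0 * math.pi * b * erb_bandwidth(fc)
    ir = []
    for k in range(n):
        t = k / fs
        envelope = t ** (order - 1) * math.exp(-decay * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc * t))
    peak = max(abs(x) for x in ir)
    return [x / peak for x in ir]

def channel_energy(signal, ir):
    """Energy of `signal` after direct (full) convolution with `ir`."""
    n_out = len(signal) + len(ir) - 1
    energy = 0.0
    for n in range(n_out):
        lo = max(0, n - len(ir) + 1)
        hi = min(n, len(signal) - 1)
        acc = 0.0
        for k in range(lo, hi + 1):
            acc += signal[k] * ir[n - k]
        energy += acc * acc
    return energy

if __name__ == "__main__":
    cfs = center_frequencies()
    print(len(cfs), "channels from", round(cfs[0], 1), "Hz to", round(cfs[-1], 1), "Hz")
```

Computing `channel_energy` for each of the 38 impulse responses on a short analysis frame gives one 38-element vector per frame, which is the kind of per-band representation that could stand in for an STFT-based spectrum. The direct convolution is written for clarity and is slow in pure Python; a practical implementation would use a vectorized or recursive filter.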
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Świercz, A., Żera, J. (2014). Model of Auditory Filters and MPEG-7 Descriptors in Sound Recognition. In: Ślȩzak, D., Schaefer, G., Vuong, S.T., Kim, YS. (eds) Active Media Technology. AMT 2014. Lecture Notes in Computer Science, vol 8610. Springer, Cham. https://doi.org/10.1007/978-3-319-09912-5_17
DOI: https://doi.org/10.1007/978-3-319-09912-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09911-8
Online ISBN: 978-3-319-09912-5
eBook Packages: Computer Science, Computer Science (R0)