In this chapter, we review the basic methods of audio signal processing, mainly from the point of view of audio classification. We first discuss the general properties of audio signals, then describe time-frequency representations for audio and review features useful for classification. Finally, we discuss prominent examples of audio classification systems, with particular emphasis on feature extraction.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Rao, P. (2008). Audio Signal Processing. In: Prasad, B., Prasanna, S.R.M. (eds) Speech, Audio, Image and Biomedical Signal Processing using Neural Networks. Studies in Computational Intelligence, vol 83. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75398-8_8
DOI: https://doi.org/10.1007/978-3-540-75398-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75397-1
Online ISBN: 978-3-540-75398-8
eBook Packages: Engineering, Engineering (R0)