Audio Signal Processing

Rao, Preeti

doi:10.1007/978-3-540-75398-8_8

Preeti Rao

Part of the book series: Studies in Computational Intelligence ((SCI,volume 83))

In this chapter, we review the basic methods for audio signal processing, mainly from the point of view of audio classification. General properties of audio signals are discussed, followed by a description of time-frequency representations for audio. Features useful for classification are then reviewed. Finally, prominent examples of audio classification systems are discussed, with particular emphasis on feature extraction.

Author information

Authors and Affiliations

Department of Electrical Engineering, Indian Institute of Technology Bombay, India
Preeti Rao

Editor information

Editors and Affiliations

Department of Computer and Information Sciences, Florida A&M University, Tallahassee, FL 32307, USA
Bhanu Prasad
Department of Electronics and Communication Engineering, Indian Institute of Technology Guwahati, Guwahati, India
S. R. Mahadeva Prasanna

About this chapter

Cite this chapter

Rao, P. (2008). Audio Signal Processing. In: Prasad, B., Prasanna, S.R.M. (eds) Speech, Audio, Image and Biomedical Signal Processing using Neural Networks. Studies in Computational Intelligence, vol 83. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75398-8_8

DOI: https://doi.org/10.1007/978-3-540-75398-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75397-1
Online ISBN: 978-3-540-75398-8
eBook Packages: Engineering, Engineering (R0)
