Audio Representation

  • Reference work entry
Encyclopedia of Database Systems

Synonyms

Audio feature extraction; Audio characterization

Definition

An audio signal is a signal that contains information in the audible frequency range. Audio representation refers to the extraction of audio signal properties, or features, that are representative of the audio signal composition (in both the temporal and spectral domains) and of the audio signal's behavior over time. Feature extraction is typically combined with feature selection, which determines the best set of features for the intended operation on the audio signal.
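
As a rough illustration of what temporal and spectral features can look like in practice, the sketch below computes three commonly used frame-level features: short-time energy and zero-crossing rate (temporal), and spectral centroid (spectral). The function names, the frame size of 256 samples, and the hop of 128 samples are assumptions chosen for the example, not values taken from this entry. The DFT is written naively for self-containment, and feature selection is not shown.

```python
import cmath
import math


def frames(signal, size, hop):
    """Split a sample sequence into overlapping fixed-length frames."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Temporal feature: mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Temporal feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def spectral_centroid(frame, sample_rate):
    """Spectral feature: magnitude-weighted mean frequency in Hz (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        mags.append(abs(coeff))
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, report 0 by convention
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def extract_features(signal, sample_rate, size=256, hop=128):
    """Return one (energy, zcr, centroid) tuple per frame.

    The frame size and hop are illustrative defaults (assumptions).
    """
    return [
        (short_time_energy(f), zero_crossing_rate(f), spectral_centroid(f, sample_rate))
        for f in frames(signal, size, hop)
    ]


if __name__ == "__main__":
    # Hypothetical test signal: a 1 kHz sine sampled at 8 kHz.
    rate = 8000
    tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(1024)]
    for energy, zcr, centroid in extract_features(tone, rate):
        print(f"energy={energy:.3f}  zcr={zcr:.3f}  centroid={centroid:.1f} Hz")
```

For the pure tone used here, each frame should yield an energy of roughly 0.5 and a spectral centroid close to 1000 Hz. A sequence of such per-frame tuples is a much smaller description of the signal than the raw samples, and a feature selection step would then keep only the features that help the intended task.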

Historical Background

Audio feature extraction typically leads to a strongly reduced audio signal representation. Obtaining such a representation can improve the efficiency of audio processing and benefit many applications based on such processing. For example, a compact representation of an audio signal in the form of a fingerprint can enable extremely fast search for a match between this signal and a large-scale audio database for the purpose of audio signal...


Copyright information

© 2009 Springer Science+Business Media, LLC

About this entry

Cite this entry

Lu, L., Hanjalic, A. (2009). Audio Representation. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_1442
