Abstract
Semantic understanding of video is an important frontier in content-based retrieval. The research literature has given significant attention to the visual aspect of video, but relatively little work directly uses audio content for video retrieval. This paper gives an overview of our current research directions in semantic video retrieval using audio content. We discuss the effectiveness of classifying audio into semantic categories by combining global and local audio features based on the frequency spectrum. Furthermore, we introduce two novel features, Frequency Spectrum Differentials (FSD) and Differential Swap Rate (DSR), both of which model the shape of the spectrum.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bakker, E.M., Lew, M.S. (2002). Semantic Video Retrieval Using Audio Analysis. In: Lew, M.S., Sebe, N., Eakins, J.P. (eds) Image and Video Retrieval. CIVR 2002. Lecture Notes in Computer Science, vol 2383. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45479-9_29
DOI: https://doi.org/10.1007/3-540-45479-9_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43899-1
Online ISBN: 978-3-540-45479-3
eBook Packages: Springer Book Archive