Towards a High-Level Audio Framework for Video Retrieval Combining Conceptual Descriptions and Fully-Automated Processes

Charhad, Mbarek; Belkhatir, Mohammed

doi:10.1007/11581772_72


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3767)

Included in the following conference series: Pacific-Rim Conference on Multimedia

Abstract

The growing need for 'intelligent' video retrieval systems is driving new architectures that combine multiple characterizations of the video content, relying on highly expressive frameworks while providing fully automated indexing and retrieval processes. Combining modalities within expressive frameworks for video indexing and retrieval is therefore of great importance and, we argue, the only way to achieve significant retrieval performance. This paper presents a multifaceted conceptual framework integrating multiple characterizations of the audio content for automatic video retrieval. It relies on an expressive representation formalism that handles high-level audio descriptions of a video document, together with a full-text query framework, in an attempt to perform video indexing and retrieval on audio features beyond state-of-the-art architectures that operate on low-level features and keyword-annotation frameworks. Experiments on the multimedia topic search task of the TRECVID 2004 evaluation campaign validate our proposal.


Author information

Authors and Affiliations

IMAG-CNRS, BP 53, 38041 Cedex 9, Grenoble, France
Mbarek Charhad & Mohammed Belkhatir

Editor information

Editors and Affiliations

Gwangju Institute of Science and Technology (GIST), 1 Oryong-dong Buk-gu, 500-712, Gwangju, Korea
Yo-Sung Ho
Multimedia Security Lab, Korea University, Science Campus, 136-701, Seoul, Korea
Hyoung Joong Kim

About this paper

Cite this paper

Charhad, M., Belkhatir, M. (2005). Towards a High-Level Audio Framework for Video Retrieval Combining Conceptual Descriptions and Fully-Automated Processes. In: Ho, YS., Kim, H.J. (eds) Advances in Multimedia Information Processing - PCM 2005. PCM 2005. Lecture Notes in Computer Science, vol 3767. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11581772_72

DOI: https://doi.org/10.1007/11581772_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30027-4
Online ISBN: 978-3-540-32130-9
eBook Packages: Computer Science (R0)
