Towards a High-Level Audio Framework for Video Retrieval Combining Conceptual Descriptions and Fully-Automated Processes

  • Conference paper
Advances in Multimedia Information Processing - PCM 2005 (PCM 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3767)


Abstract

The growing need for "intelligent" video retrieval systems calls for new architectures that combine multiple characterizations of the video content, relying on highly expressive frameworks while still providing fully automated indexing and retrieval processes. Combining modalities within expressive frameworks for video indexing and retrieval is therefore of great importance, and it is the only solution for achieving significant retrieval performance. This paper presents a multi-faceted conceptual framework that integrates multiple characterizations of the audio content for automatic video retrieval. It relies on an expressive representation formalism that handles high-level audio descriptions of a video document, together with a full-text query framework, in order to perform video indexing and retrieval on audio features in a way that goes beyond state-of-the-art architectures operating on low-level features and keyword-annotation frameworks. Experiments on the multimedia topic search task of the TRECVID 2004 evaluation campaign validate our proposal.




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Charhad, M., Belkhatir, M. (2005). Towards a High-Level Audio Framework for Video Retrieval Combining Conceptual Descriptions and Fully-Automated Processes. In: Ho, YS., Kim, H.J. (eds) Advances in Multimedia Information Processing - PCM 2005. PCM 2005. Lecture Notes in Computer Science, vol 3767. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11581772_72


  • DOI: https://doi.org/10.1007/11581772_72

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30027-4

  • Online ISBN: 978-3-540-32130-9

  • eBook Packages: Computer Science, Computer Science (R0)
