research-article

(Un)Reliability of video concept detection

Authors:

Alexander G. Hauptmann

CIVR '08: Proceedings of the 2008 international conference on Content-based image and video retrieval

Pages 85 - 94

https://doi.org/10.1145/1386352.1386367

Published: 07 July 2008

Abstract

Great effort has gone into improving video concept detection, and continuous progress has been reported. Because the current evaluation method is confined to carefully annotated domains, it is quite forgiving, and the reliability of state-of-the-art concept classifiers remains in question. Adopting a more rigorous evaluation approach, we find that most concept classifiers built using the mainstream approach are unreliable because they generalize poorly to domains other than their training domain. Moreover, evidence shows that SVM-based concept classifiers learn little beyond memorizing most of the positive training data and behave much like memory-based models such as kNN, as indicated by the comparable performance of the two models. Examining the properties of the reliable concept classifiers, we find that classifiers of frequent concepts, "bloated" classifiers, and classifiers capable of learning the pattern of the data tend to be more reliable. This paper contributes to a better understanding of concept detection, suggests heuristics for identifying reliable concept classifiers, and discusses solutions for improving concept detection reliability.
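
To make the idea of a "more rigorous" cross-domain evaluation concrete, the following is a minimal sketch and not the paper's experimental setup. The abstract does not specify the features, datasets, or metric, so everything here is an assumption made for illustration: the synthetic 2-D features, the kNN scorer standing in for a memory-based model, the use of average precision as the metric, and all function names. It trains on one domain and compares average precision on held-out data from the same domain with data from a shifted domain.

```python
import random

def average_precision(scores, labels):
    """Non-interpolated average precision of a ranking (labels are 0/1)."""
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y:
            hits += 1
            total += hits / rank  # precision at each positive's rank
    return total / hits if hits else 0.0

def knn_score(x, train, k=5):
    """Fraction of positives among the k nearest training points."""
    nearest = sorted(
        train,
        key=lambda t: sum((a - b) ** 2 for a, b in zip(x, t[0])),
    )[:k]
    return sum(y for _, y in nearest) / k

def make_domain(n, shift, rng):
    """Hypothetical domain: positives near (2+shift, 2+shift),
    negatives near (shift, shift); about 20% positives."""
    data = []
    for _ in range(n):
        y = 1 if rng.random() < 0.2 else 0
        center = 2.0 + shift if y else shift
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), y))
    return data

def evaluate(train, test, k=5):
    """Score every test point with kNN and return average precision."""
    scores = [knn_score(x, train, k) for x, _ in test]
    return average_precision(scores, [y for _, y in test])

rng = random.Random(0)
train = make_domain(300, 0.0, rng)         # training domain
same_domain = make_domain(300, 0.0, rng)   # held-out data, same domain
other_domain = make_domain(300, 2.0, rng)  # assumed shifted domain

ap_in = evaluate(train, same_domain)
ap_cross = evaluate(train, other_domain)
print(f"in-domain AP:    {ap_in:.3f}")
print(f"cross-domain AP: {ap_cross:.3f}")
```

In this toy setting the shift moves the other domain's negatives onto the region where the training positives lie, so a memory-based scorer ranks them highly and the cross-domain score drops. Real domain shifts in video are far less tidy; the sketch only illustrates why in-domain evaluation alone can be forgiving.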

Index Terms

(Un)Reliability of video concept detection
1. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published In

CIVR '08: Proceedings of the 2008 international conference on Content-based image and video retrieval

July 2008

674 pages

ISBN:9781605580708

DOI:10.1145/1386352

General Chairs: Jiebo Luo (Kodak Research Laboratories), Ling Guan (Ryerson University)

Program Chairs: Alan Hanjalic (Delft University of Technology), Mohan Kankanhalli (National University of Singapore), Ivan Lee (University of South Australia)

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIVR08

Sponsor:

CIVR08: CIVR'08 - International Conference on Content-based Image and Video Retrieval

July 7 - 9, 2008

Niagara Falls, Canada

Cited By

Ewerth R, Springstein M, Phan-Vogtmann L, Schütze J (2017). "Are Machines Better Than Humans in Image Tagging?" - A User Study Adds to the Puzzle. Advances in Information Retrieval, pages 186-198. Online publication date: 8-Apr-2017.
https://doi.org/10.1007/978-3-319-56608-5_15
Safadi B, Derbas N, Quénot G (2015). Descriptor optimization for multimedia indexing and retrieval. Multimedia Tools and Applications 74:4, pages 1267-1290. Online publication date: 1-Feb-2015.
https://dl.acm.org/doi/10.1007/s11042-014-2071-6
Kordumova S, Li X, Snoek C (2015). Best practices for learning video concept detectors from social media examples. Multimedia Tools and Applications 74:4, pages 1291-1315. Online publication date: 1-Feb-2015.
https://dl.acm.org/doi/10.1007/s11042-014-2056-5
Mühling M, Ewerth R, Freisleben B (2015). Improving Cross-Domain Concept Detection via Object-Based Features. Proceedings, Part II, of the 16th International Conference on Computer Analysis of Images and Patterns - Volume 9257, pages 359-370. Online publication date: 2-Sep-2015.
https://dl.acm.org/doi/10.1007/978-3-319-23117-4_31
Habibian A, Snoek C (2014). Recommendations for recognizing video events by concept vocabularies. Computer Vision and Image Understanding 124, pages 110-122. Online publication date: Jul-2014.
https://doi.org/10.1016/j.cviu.2014.02.003
Aly R, Larson M (2014). Detector Performance Prediction Using Set Annotations. Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation, pages 262-275. Online publication date: 29-Oct-2014.
https://doi.org/10.1007/978-3-319-12093-5_16
Godin F, De Neve W, Van de Walle R, Jain R, Prabhakaran B, Worring M, Smith J, Chua T (2013). Towards fusion of collective knowledge and audio-visual content features for annotating broadcast video. Proceedings of the 3rd ACM conference on International conference on multimedia retrieval, pages 329-332. Online publication date: 16-Apr-2013.
https://dl.acm.org/doi/10.1145/2461466.2461530
Zhu X, Huang Z, Cui J, Shen H (2013). Video-to-Shot Tag Propagation by Graph Sparse Group Lasso. IEEE Transactions on Multimedia 15:3, pages 633-646. Online publication date: 1-Apr-2013.
https://dl.acm.org/doi/10.1109/TMM.2012.2233723
Jiang Y, Wang J, Xue X, Chang S (2013). Query-Adaptive Image Search With Hash Codes. IEEE Transactions on Multimedia 15:2, pages 442-453. Online publication date: 1-Feb-2013.
https://dl.acm.org/doi/10.1109/TMM.2012.2231061
Kordumova S, Li X, Snoek C (2013). Evaluating sources and strategies for learning video concepts from social media. 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI), pages 91-96. Online publication date: Jun-2013.
https://doi.org/10.1109/CBMI.2013.6576561
