short-paper

Semantic video indexing by fusing explicit and implicit context spaces

Authors:

Xiangyang Xue

MM '10: Proceedings of the 18th ACM international conference on Multimedia

Pages 967 - 970

https://doi.org/10.1145/1873951.1874125

Published: 25 October 2010

Abstract

This paper addresses the problem of context-based concept fusion (CBCF) for concept detection and semantic video indexing. We introduce a novel framework that constructs context spaces of concepts, so that contextual correlations can be used to improve the performance of concept detectors. Unlike traditional CBCF approaches, we present two kinds of context spaces: an explicit context space that models the correlation between pairs of concepts, and an implicit context space that represents latent themes trained from a set of concepts. The final concept detection scores are then fused directly from the explicit and implicit context spaces. Experiments on the TRECVid 2006 benchmark and comparisons with several state-of-the-art approaches demonstrate the effectiveness of the proposed framework.
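The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the explicit context space can be approximated by Pearson correlations between concept labels on training shots, and it swaps in a simple projection onto hand-specified theme vectors as a stand-in for the trained latent themes. The function names, the `themes` vectors, and the weight `alpha` are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def explicit_scores(scores, labels):
    """Assumed explicit context: re-score each concept from the other concepts,
    weighted by pairwise label correlation measured on training shots."""
    k = len(scores)
    cols = [[row[j] for row in labels] for j in range(k)]
    refined = []
    for i in range(k):
        total, norm = 0.0, 0.0
        for j in range(k):
            if j == i:
                continue
            w = pearson(cols[i], cols[j])
            # A negatively correlated concept supports concept i when it is absent.
            total += w * scores[j] if w >= 0 else -w * (1.0 - scores[j])
            norm += abs(w)
        # With no correlated neighbours, fall back to the detector's own score.
        refined.append(total / norm if norm else scores[i])
    return refined


def implicit_scores(scores, themes):
    """Assumed implicit context: project the score vector onto theme vectors and
    reconstruct it. Themes are taken as given and mutually orthogonal here; a real
    system would learn them from data."""
    recon = [0.0] * len(scores)
    for theme in themes:
        energy = sum(t * t for t in theme)
        if energy == 0:
            continue
        coef = sum(s * t for s, t in zip(scores, theme)) / energy
        for i, t in enumerate(theme):
            recon[i] += coef * t
    return [min(1.0, max(0.0, r)) for r in recon]


def fuse(scores, labels, themes, alpha=0.5):
    """Linear fusion of the two context-based scores (alpha is a hypothetical weight)."""
    exp = explicit_scores(scores, labels)
    imp = implicit_scores(scores, themes)
    return [alpha * e + (1.0 - alpha) * m for e, m in zip(exp, imp)]


if __name__ == "__main__":
    # Toy data: concepts are (road, car, indoor); rows of `labels` are training shots.
    labels = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [0, 0, 1]]
    themes = [[1, 1, 0], [0, 0, 1]]  # hand-set "traffic" and "indoor" themes
    scores = [0.9, 0.4, 0.1]         # raw detector outputs for one test shot
    print([round(v, 3) for v in fuse(scores, labels, themes)])
```

In this toy setting the weak "car" detection is pulled upward by the confident "road" score and the low "indoor" score, which is the kind of contextual correction CBCF aims for.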

Index Terms

Semantic video indexing by fusing explicit and implicit context spaces
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Video summarization
2. Information systems
  1. Information retrieval
    1. Document representation
    2. Search engine architectures and scalability
      1. Search engine indexing

Published In

MM '10: Proceedings of the 18th ACM international conference on Multimedia

October 2010

1836 pages

ISBN:9781605589336

DOI:10.1145/1873951

General Chairs: Alberto del Bimbo (University of Florence, Italy), Shih-Fu Chang (Columbia University, USA)

Program Chair: Arnold Smeulders (University of Amsterdam, NL)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '10

Sponsor:

SIGMM

MM '10: ACM Multimedia Conference

October 25 - 29, 2010

Firenze, Italy

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
