poster

n-gram Models for Video Semantic Indexing

Authors:
Nakamasa Inoue, Tokyo Institute of Technology, Tokyo, Japan
Koichi Shinoda, Tokyo Institute of Technology, Tokyo, Japan

MM '14: Proceedings of the 22nd ACM international conference on Multimedia, November 2014, Pages 777–780, https://doi.org/10.1145/2647868.2654961

Published: 03 November 2014

ABSTRACT

We propose n-gram modeling of shot sequences for video semantic indexing, the task of extracting semantic concepts from video shots. Most previous studies of this task have assumed that the video shots in a video clip are independent of each other. We instead model the time dependency between shots by assuming that n consecutive video shots are dependent. Our models improve robustness against occlusion and camera-angle changes by using information from the preceding video shots. In experiments on the TRECVID 2012 Semantic Indexing benchmark, we applied the proposed models to a system using Gaussian mixture models and support vector machines. Mean average precision improved from 30.62% to 32.14%, which is, to the best of our knowledge, the best performance reported on TRECVID 2012 Semantic Indexing.
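The abstract does not spell out the model equations, so the following is only a minimal sketch of the general idea of letting the n-1 preceding shots inform the score of the current shot. It is not the authors' formulation: the function names `ngram_smooth` and `average_precision`, the geometric `decay` weighting, and the toy scores are all assumptions made for illustration. The evaluation helper computes plain average precision, which may differ from the exact measure used in the TRECVID evaluation.

```python
from typing import List, Sequence


def ngram_smooth(scores: Sequence[float], n: int = 3, decay: float = 0.5) -> List[float]:
    """Blend each shot's score with the scores of up to n-1 preceding shots.

    Hypothetical weighting: the shot k steps back gets weight decay**k,
    and the weights are normalized to sum to 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    smoothed = []
    for t in range(len(scores)):
        # Window covers shots t, t-1, ..., t-(n-1), clipped at the clip start.
        window = [scores[t - k] for k in range(n) if t - k >= 0]
        weights = [decay ** k for k in range(len(window))]
        total = sum(weights)
        smoothed.append(sum(w * s for w, s in zip(weights, window)) / total)
    return smoothed


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Plain average precision of a ranking induced by the scores."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


if __name__ == "__main__":
    # Toy clip: the concept is present in shots 0-3, but shot 2 scores low
    # (for example, because the object is briefly occluded).
    scores = [0.9, 0.8, 0.2, 0.85, 0.1, 0.3]
    labels = [1, 1, 1, 1, 0, 0]
    smoothed = ngram_smooth(scores, n=2, decay=0.5)
    print("independent shots AP:", average_precision(scores, labels))
    print("bigram-smoothed AP:  ", average_precision(smoothed, labels))
```

With `n=1` the sketch reduces to the independent-shot assumption, since each window then contains only the current shot. Larger `n` lets an occluded shot borrow evidence from its predecessors, at the cost of also carrying a positive score over into the first shot after the concept disappears.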

Figure 4: Comparison of our methods with TRECVID 2012 Semantic Indexing Submissions. Mean AP of the best submission was 32.10%.

Index Terms

n-gram Models for Video Semantic Indexing
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks

Published in
MM '14: Proceedings of the 22nd ACM international conference on Multimedia
November 2014
1310 pages
ISBN:9781450330633
DOI:10.1145/2647868
General Chairs: Kien A. Hua (University of Central Florida, USA), Yong Rui (Microsoft Research, China), Ralf Steinmetz (Technische Universität Darmstadt, Germany)
Program Chairs: Alan Hanjalic (Delft University of Technology, Netherlands), Apostol (Paul) Natsev (Google, USA), Wenwu Zhu (Tsinghua University, China)
Copyright © 2014 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
gaussian mixture models
n-gram models
semantic indexing
video search
Qualifiers
- poster
Conference

Acceptance Rates
MM '14 Paper Acceptance Rate: 55 of 286 submissions, 19%
Overall Acceptance Rate: 995 of 4,171 submissions, 24%