DOI: 10.1145/2647868.2654961
poster

n-gram Models for Video Semantic Indexing

Published: 3 November 2014

ABSTRACT

We propose n-gram modeling of shot sequences for video semantic indexing, the task of extracting semantic concepts from video shots. Most previous studies on this task have assumed that the shots in a video clip are independent of each other. We instead model the time dependency between shots by assuming that n consecutive shots are dependent. Our models improve robustness against occlusion and camera-angle changes by effectively using information from the preceding shots. In experiments on the TRECVID 2012 Semantic Indexing benchmark, we applied the proposed models to a system based on Gaussian mixture models and support vector machines. Mean average precision improved from 30.62% to 32.14%, which is, to the best of our knowledge, the best performance reported on TRECVID 2012 Semantic Indexing.
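The abstract does not give the model's equations, so the following is only a rough sketch of the general idea of letting the previous shots inform the current one. It assumes that each shot already has a per-concept detection score (for example from an SVM) and combines the scores of n consecutive shots with exponentially decaying weights. The function name `smooth_scores` and the `decay` parameter are hypothetical, and this weighting scheme is an illustrative stand-in, not the authors' n-gram model.

```python
def smooth_scores(scores, n=2, decay=0.5):
    """Combine each shot's score with those of the previous n-1 shots.

    scores: per-shot detection scores for one concept, in temporal order.
    n:      number of consecutive shots treated as dependent (assumed >= 1).
    decay:  hypothetical weight factor; a shot k steps back gets decay**k.
    """
    smoothed = []
    for t in range(len(scores)):
        # The window covers the current shot and up to n-1 preceding shots.
        start = max(0, t - n + 1)
        weights = [decay ** (t - i) for i in range(start, t + 1)]
        total = sum(w * scores[i] for w, i in zip(weights, range(start, t + 1)))
        # Normalize so that a constant score sequence is left unchanged.
        smoothed.append(total / sum(weights))
    return smoothed


# A concept visible in shots 0 and 2 but occluded in shot 1:
# the dip at shot 1 is softened by the high score of shot 0.
example = smooth_scores([0.9, 0.1, 0.8], n=2, decay=0.5)
```

With `n=1` the function returns the scores unchanged, which corresponds to the independence assumption the abstract attributes to earlier work.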

Figure 4: Comparison of our methods with TRECVID 2012 Semantic Indexing submissions. Mean AP of the best submission was 32.10%.

Published in

MM '14: Proceedings of the 22nd ACM international conference on Multimedia
November 2014
1310 pages
ISBN: 9781450330633
DOI: 10.1145/2647868

      Copyright © 2014 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

MM '14 paper acceptance rate: 55 of 286 submissions, 19%. Overall acceptance rate: 995 of 4,171 submissions, 24%.
