DOI: 10.1145/3078971.3079044
tutorial

Video Indexing, Search, Detection, and Description with Focus on TRECVID

Published: 06 June 2017

ABSTRACT

There has been tremendous growth in video data over the last decade. People are using mobile phones and tablets to take, share, or watch videos more than ever before. Video cameras are almost everywhere in the public domain (e.g., stores, streets, and public facilities). Efficient and effective retrieval methods are critically needed in many applications. The goal of TRECVID is to encourage research in content-based video retrieval by providing large test collections, uniform scoring procedures, and a forum for organizations interested in comparing their results. In this tutorial, we present and discuss some of the most important and fundamental content-based video retrieval problems: recognizing predefined visual concepts; searching videos for complex ad-hoc user queries; searching a video dataset by image or video examples to retrieve specific objects, persons, or locations; detecting events; and bridging the gap between vision and language by examining how systems can automatically describe videos in natural language. Regular TRECVID participating teams will present a review of the state of the art, current challenges, and future directions, along with pointers to useful resources. Each team will present one of the following tasks:

  • Semantic INdexing (SIN)

  • Zero-example (0Ex) Video Search (AVS)

  • Instance Search (INS)

  • Multimedia Event Detection (MED)

  • Video to Text (VTT)


    • Published in

      ICMR '17: Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval
      June 2017
      524 pages
      ISBN: 9781450347013
      DOI: 10.1145/3078971
      • General Chairs: Bogdan Ionescu, Nicu Sebe
      • Program Chairs: Jiashi Feng, Martha Larson, Rainer Lienhart, Cees Snoek

      Copyright © 2017 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      ICMR '17 Paper Acceptance Rate: 33 of 95 submissions, 35%. Overall Acceptance Rate: 254 of 830 submissions, 31%.
