DOI: 10.1145/3581783.3612829
research-article

The ACM Multimedia 2023 Deep Video Understanding Grand Challenge

Published: 27 October 2023

Abstract

This is the overview paper for the Deep Video Understanding (DVU) Grand Challenge. In recent years, growing interest in understanding videos, and movies in particular, at a deeper level has motivated researchers in multimedia and computer vision to present new approaches and datasets for this problem. This challenging research area aims to develop a deep understanding of the relations that exist between individuals and entities in movies, using all available modalities such as video, audio, text, and metadata. The aim of this grand challenge is to foster innovative research in this new direction and to provide benchmarking evaluations that advance technologies in the deep video understanding community.

Cited By

  • (2024) Performance Evaluation in Multimedia Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications 21:1 (1-23). DOI: 10.1145/3678881. Online publication date: 14-Oct-2024

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN: 9798400701085
DOI: 10.1145/3581783
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. computer vision
  2. movie analysis
  3. multimedia understanding

Qualifiers

  • Research-article

Conference

MM '23
MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
