Regularized Two Granularity Loss Function for Weakly Supervised Video Moment Retrieval


Abstract:

Weakly supervised video moment retrieval, also called weakly supervised language moment retrieval, aims to find the video moment most relevant to a given language query. To guide the model toward the video segments that best match the text description, we design a two-granularity loss function that considers both video-level and instance-level relationships simultaneously. Specifically, we first generate coarse video segments and regard each video segment as an instance. For the video-level regularized multiple instance learning (MIL) loss, we leverage the latent alignment between all intra-video segments (i.e., the positive bag) and the text descriptions. Then, we classify these segments by treating this procedure as a supervised learning task under noisy labels. With the instance-level regularized loss function, our model can learn to correct noisy instance-level labels so as to locate more accurate frame boundaries from among the positive instances. Comprehensive experimental results on ActivityNet and DiDeMo demonstrate that the proposed loss function sets a new state of the art.
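
The two granularities described in the abstract can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not give. The video-level term below is a generic max-pooled hinge ranking loss over bags of segments. The instance-level term swaps in soft bootstrapping, a standard noisy-label technique, as a stand-in for the paper's regularized loss. All function names, the margin, `beta`, and the weight `lam` are hypothetical.

```python
import math


def video_level_mil_loss(pos_scores, neg_scores, margin=0.5):
    """Hinge ranking loss between two bags of segment-query scores.

    Assumption: a bag is scored by its best segment (max pooling), and the
    bag from the matching video should beat the bag from a mismatched video
    by at least `margin`.
    """
    return max(0.0, margin - max(pos_scores) + max(neg_scores))


def instance_level_loss(scores, noisy_labels, beta=0.8, eps=1e-7):
    """Soft-bootstrapped binary cross-entropy over individual segments.

    Assumption: each noisy label is blended with the model's own prediction,
    so confident predictions can partially override a wrong label.
    """
    total = 0.0
    for s, y in zip(scores, noisy_labels):
        p = 1.0 / (1.0 + math.exp(-s))            # sigmoid of the raw score
        p = min(max(p, eps), 1.0 - eps)           # keep log() finite
        target = beta * y + (1.0 - beta) * p      # blended (softened) label
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total / len(scores)


def two_granularity_loss(pos_scores, neg_scores, noisy_labels, lam=1.0):
    """Weighted sum of the video-level and instance-level terms."""
    video_term = video_level_mil_loss(pos_scores, neg_scores)
    instance_term = instance_level_loss(pos_scores, noisy_labels)
    return video_term + lam * instance_term


# Hypothetical scores for three coarse segments of one matching video,
# three segments of a mismatched video, and noisy per-segment labels.
pos = [2.0, 0.1, -1.0]
neg = [-0.5, 0.0, -2.0]
labels = [1, 1, 0]
loss = two_granularity_loss(pos, neg, labels)
```

In this sketch the video-level term only compares bags, so it never says which segment inside the positive bag is correct. The instance-level term is what pushes individual segment scores toward their softened labels.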
Published in: IEEE Transactions on Multimedia ( Volume: 24)
Page(s): 1141 - 1151
Date of Publication: 20 October 2021
