Loading [a11y]/accessibility-menu.js
Motion Stimulation for Compositional Action Recognition | IEEE Journals & Magazine | IEEE Xplore

Motion Stimulation for Compositional Action Recognition


Abstract:

Recognizing the unseen combinations of action and different objects, namely (zero-shot) compositional action recognition, is extremely challenging for conventional action...Show More

Abstract:

Recognizing the unseen combinations of action and different objects, namely (zero-shot) compositional action recognition, is extremely challenging for conventional action recognition algorithms in real-world applications. Previous methods focus on enhancing the dynamic clues of objects that appear in the scene by building region features or tracklet embedding from ground-truths or detected bounding boxes. These methods rely heavily on manual annotation or the quality of detectors, which are inflexible for practical applications. In this work, we aim to mining the temporal clues from moving objects or hands without explicit supervision. Thus, we propose a novel Motion Stimulation (MS) block, which is specifically designed to mine dynamic clues of the local regions autonomously from adjacent frames. Furthermore, MS consists of the following three steps: motion feature extraction, motion feature recalibration, and action-centric excitation. The proposed MS block can be directly and conveniently integrated into existing video backbones to enhance the ability of compositional generalization for action recognition algorithms. Extensive experimental results on three action recognition datasets, the Something-Else, IKEA-Assembly and EPIC-KITCHENS datasets, indicate the effectiveness and interpretability of our MS block.
Page(s): 2061 - 2074
Date of Publication: 14 November 2022

ISSN Information:

Funding Agency:


Contact IEEE to Subscribe

References

References is not available for this document.