
Holographic Feature Learning of Egocentric-Exocentric Videos for Multi-Domain Action Recognition


Abstract:

Although existing cross-domain action recognition methods successfully improve performance on videos of one view (e.g., egocentric videos) by transferring knowledge from videos of another view (e.g., exocentric videos), their generality is limited because the source and target domains must be fixed beforehand. In this paper, we address the more practical task of multi-domain action recognition on egocentric-exocentric videos, which aims to learn a single model that recognizes test videos from either the egocentric or the exocentric perspective by transferring knowledge between the two domains. Although previous cross-domain methods can also transfer knowledge from one domain to the other by learning view-invariant representations of the two video domains, they are not suitable for multi-domain action recognition because they tend to lose view-specific visual information. As a solution, we propose to map a video from either the egocentric or the exocentric perspective into a global feature space (which we call the holographic feature space) that shares both the view-invariant and the view-specific visual knowledge of the two views. Specifically, we decompose the video feature into a view-invariant component and a view-specific component, and write the view-specific component into memory networks that preserve view-specific visual knowledge. The final holographic feature combines the view-invariant feature with the view-specific features of both views retrieved from the memory networks. We demonstrate the effectiveness of the proposed method with extensive experimental results on two public datasets. Moreover, strong performance under the semi-supervised setting shows the generality of our model.
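The abstract does not give implementation details, so the following is only a minimal, hypothetical PyTorch sketch of the described idea: a video feature is decomposed into a view-invariant and a view-specific component, the view-specific component queries per-view memory banks, and the holographic feature fuses the view-invariant part with the knowledge retrieved from both memories. All layer shapes, the attention-style memory read, and the fusion-by-concatenation choice are assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HolographicFeature(nn.Module):
    """Illustrative sketch (assumed design, not the authors' implementation)."""
    def __init__(self, feat_dim=1024, mem_slots=64):
        super().__init__()
        # Linear projections that decompose the input feature (assumption).
        self.invariant_proj = nn.Linear(feat_dim, feat_dim)
        self.specific_proj = nn.Linear(feat_dim, feat_dim)
        # One learnable memory bank per view (egocentric / exocentric).
        self.memory = nn.ParameterDict({
            "ego": nn.Parameter(torch.randn(mem_slots, feat_dim)),
            "exo": nn.Parameter(torch.randn(mem_slots, feat_dim)),
        })
        self.fuse = nn.Linear(3 * feat_dim, feat_dim)

    def read_memory(self, query, view):
        # Attention-style read: similarity over slots -> weighted sum of slots.
        attn = F.softmax(query @ self.memory[view].t(), dim=-1)
        return attn @ self.memory[view]

    def forward(self, video_feat):
        # Decompose into view-invariant and view-specific components.
        invariant = self.invariant_proj(video_feat)
        specific = self.specific_proj(video_feat)
        # Retrieve view-specific knowledge from BOTH memories,
        # using the view-specific component as the query.
        ego_mem = self.read_memory(specific, "ego")
        exo_mem = self.read_memory(specific, "exo")
        # Holographic feature: invariant part plus both views' specific parts.
        return self.fuse(torch.cat([invariant, ego_mem, exo_mem], dim=-1))

# Usage: a batch of 8 pooled video features of dimension 1024.
model = HolographicFeature()
holo = model(torch.randn(8, 1024))  # -> shape (8, 1024)
```

A single model of this form can accept a test video from either view, which is what makes it usable for the multi-domain setting rather than a fixed source-to-target transfer.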
Published in: IEEE Transactions on Multimedia (Volume: 24)
Page(s): 2273 - 2286
Date of Publication: 12 May 2021
