Temporal Refinement Graph Convolutional Network for Skeleton-Based Action Recognition


Impact Statement:
GCN has been widely used in skeleton-based action recognition. In this article, a temporal refinement graph convolution module with a contrastive learning mechanism is proposed to better model the latent features of motion dynamics by assigning different importance to the channel and spatiotemporal dimensions and by maximizing the learned mutual representations. An interframe correlation matrix is proposed to embed the distant temporal correlations of frame pairs into the skeletal representations and to generalize the GCN operator to the temporal domain. An STCA module is proposed to establish the short-range and long-term dependencies of a skeletal sequence by hierarchically enlarging the receptive field through successive feature flows within feature branches. The overall framework combines these three designs, which can effectively improve action recognition performance on public datasets.
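
To make the idea of an interframe correlation matrix concrete, the following is a minimal sketch only, not the formulation used in the article (which this page does not give). It assumes each frame is summarized by a single feature vector, scores every frame pair with a scaled dot product, row-normalizes the scores with a softmax, and then mixes features along time in the same way a spatial graph convolution mixes features along an adjacency matrix. The function names `interframe_correlation` and `temporal_aggregate` and the toy values are hypothetical.

```python
import math


def interframe_correlation(frames):
    """Row-normalized similarity between every pair of frames.

    frames: list of T feature vectors (each a list of D floats).
    Returns a T x T matrix whose rows each sum to 1.
    Assumption: scaled dot product + softmax stands in for the
    article's (unspecified here) correlation measure.
    """
    scale = math.sqrt(len(frames[0]))
    matrix = []
    for query in frames:
        # Similarity of this frame with every frame in the sequence.
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in frames]
        # Softmax, subtracting the maximum for numerical stability.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        matrix.append([e / total for e in exps])
    return matrix


def temporal_aggregate(corr, frames):
    """Mix frame features along the time axis: Y = C X."""
    dim = len(frames[0])
    return [
        [sum(w * frame[d] for w, frame in zip(row, frames)) for d in range(dim)]
        for row in corr
    ]


# Toy sequence: 4 frames, 3 features each (hypothetical values).
sequence = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
corr = interframe_correlation(sequence)
refined = temporal_aggregate(corr, sequence)
```

Because the matrix is dense over all frame pairs, every output frame can draw on distant frames in one step, which is the property the statement above attributes to the interframe correlation matrix. A real model would apply learned projections before the similarity and operate per joint and per channel.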

Abstract:

Human skeleton data, widely used in human activity recognition, is arguably the most representative biometric characteristic because it is intuitive and easy to visualize. State-of-the-art approaches mainly focus on improving the modeling of spatial correlations within graph topologies. However, interframe motion representations are also of vital importance, and we argue that they deserve attention and further exploration. Therefore, a temporal refinement module with a contrastive learning mechanism is proposed and fused as a complement to the conventional spatial graph convolution layer. In addition, to further exploit interframe variances and generalize the graph convolutional network (GCN) operation to the temporal dimension, a temporal-correlation matrix is introduced to effectively capture dynamic dependencies within frame pairs, enhancing semantic feature representation. Moreover, since GCN is a typical local operator which lacks the capability to fully m...
Published in: IEEE Transactions on Artificial Intelligence (Volume: 5, Issue: 4, April 2024)
Page(s): 1586 - 1598
Date of Publication: 06 November 2023
Electronic ISSN: 2691-4581
