research-article

Temporal-Aware Graph Convolution Network for Skeleton-based Action Recognition

Authors:

Fang REN

ICCCV '21: Proceedings of the 4th International Conference on Control and Computer Vision

Pages 83 - 90

https://doi.org/10.1145/3484274.3484288

Published: 23 November 2021

Abstract

Graph convolutional networks (GCNs) have drawn attention for skeleton-based action recognition because a skeleton, with its joints and bones, can be naturally regarded as a graph. However, existing methods are limited in modeling the temporal sequence of human actions. To account for temporal factors in action modeling, we present a novel Temporal-Aware Graph Convolution Network (TA-GCN). First, we design a causal temporal convolution (CTCN) layer that ensures no future information leaks into the past. Second, we present a novel cross-spatial-temporal graph convolution (3D-GCN) layer that extends an adaptive graph from the spatial to the temporal domain to capture local cross-spatial-temporal dependencies among joints. By incorporating these two temporal factors, TA-GCN can model the sequential nature of human actions. Experimental results on two large-scale datasets, NTU-RGB+D and Kinetics-Skeleton, indicate that our network improves accuracy by about 1% on both datasets over previous methods.
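The causality property that the abstract attributes to the CTCN layer can be illustrated with a minimal sketch. The abstract does not specify the layer's actual design, so the following is only a generic causal 1D convolution with left zero-padding, written in plain Python. The function name `causal_temporal_conv`, the kernel weights, and the toy sequence are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def causal_temporal_conv(x: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Generic causal 1D convolution along time (illustrative, not the paper's layer).

    y[t] = sum_k weights[k] * x[t - k], where x[t - k] counts as 0 when t - k < 0.
    Output frame t therefore depends only on frames t, t-1, ..., t-K+1.
    """
    kernel_size = len(weights)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k in range(kernel_size):
            if t - k >= 0:  # left zero-padding: frames before the start contribute 0
                acc += weights[k] * x[t - k]
        out.append(acc)
    return out


# Toy sequence: one coordinate of one joint over six frames (made-up values).
frames = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
# Hypothetical weights for the current, previous, and second-previous frame.
kernel = [0.5, 0.3, 0.2]

y = causal_temporal_conv(frames, kernel)

# Perturb only the last frame and recompute.
perturbed = frames[:-1] + [100.0]
y_perturbed = causal_temporal_conv(perturbed, kernel)

# With a causal kernel, every output before the perturbed frame is unchanged.
no_leak = y[:-1] == y_perturbed[:-1]
print(no_leak)
```

Changing the final frame alters only the final output, which is the sense in which a causal convolution prevents future frames from influencing earlier ones. A symmetric (non-causal) kernel centered on frame t would fail this check because it reads frames after t.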

Index Terms

Temporal-Aware Graph Convolution Network for Skeleton-based Action Recognition
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Theory of computation
  1. Theory and algorithms for application domains

Index terms have been assigned to the content through auto-classification.

Published In

ICCCV '21: Proceedings of the 4th International Conference on Control and Computer Vision

August 2021

207 pages

ISBN:9781450390477

DOI:10.1145/3484274

Copyright © 2021 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICCCV'21

ICCCV'21: 2021 4th International Conference on Control and Computer Vision

August 13 - 15, 2021

Macau, China
