DOI: 10.1145/3484274.3484288
research-article

Temporal-Aware Graph Convolution Network for Skeleton-based Action Recognition

Published: 23 November 2021

Abstract

Graph convolutional networks (GCNs) have drawn attention for skeleton-based action recognition because a skeleton, with its joints and bones, is naturally represented as a graph. However, existing methods are limited in how they model the temporal sequence of human actions. To account for temporal factors in action modeling, we present a novel Temporal-Aware Graph Convolution Network (TA-GCN). First, we design a causal temporal convolution (CTCN) layer that prevents future information from leaking into the past. Second, we present a novel cross-spatial-temporal graph convolution (3D-GCN) layer that extends an adaptive graph from the spatial to the temporal domain to capture local cross-spatial-temporal dependencies among joints. With these two temporal components, TA-GCN can model the sequential nature of human actions. Experimental results on two large-scale datasets, NTU-RGB+D and Kinetics-Skeleton, indicate that our network improves accuracy by about 1% on both datasets over previous methods.
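The causal constraint described in the abstract can be illustrated with a minimal sketch. This is not the paper's CTCN layer: the function names, the single-channel list layout, and the kernel values are assumptions chosen for illustration. It only shows the general idea that padding on the past side alone keeps each output frame independent of later frames, in contrast to a conventional centered convolution.

```python
def causal_conv1d(x, weights):
    """Causal 1D convolution over time (illustrative, single channel).

    x: one value per frame, e.g. a single joint coordinate (assumed layout).
    weights: kernel taps; weights[-1] multiplies the current frame and
             weights[0] the oldest frame in the window.
    """
    k = len(weights)
    # Pad only on the left (the past), so output[t] sees x[t-k+1 .. t].
    padded = [0.0] * (k - 1) + list(x)
    return [
        sum(w * padded[t + i] for i, w in enumerate(weights))
        for t in range(len(x))
    ]


def centered_conv1d(x, weights):
    """Conventional symmetric-padding convolution, shown for contrast."""
    k = len(weights)
    half = k // 2
    # Pad on both sides, so output[t] also sees frames after t.
    padded = [0.0] * half + list(x) + [0.0] * (k - 1 - half)
    return [
        sum(w * padded[t + i] for i, w in enumerate(weights))
        for t in range(len(x))
    ]


# Hypothetical 5-frame sequence and 3-tap kernel.
seq = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.25, 0.25, 0.5]

# Change only the last frame and compare the outputs for earlier frames.
changed = seq[:-1] + [100.0]

causal_same = causal_conv1d(seq, kernel)[:-1] == causal_conv1d(changed, kernel)[:-1]
centered_same = centered_conv1d(seq, kernel)[:-1] == centered_conv1d(changed, kernel)[:-1]

print("causal outputs before the last frame unchanged:", causal_same)
print("centered outputs before the last frame unchanged:", centered_same)
```

In the causal variant, editing the final frame leaves every earlier output untouched, while the centered variant lets that edit alter the output one frame earlier. A full layer would apply the same padding rule per channel and per joint.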


        Published In

        ICCCV '21: Proceedings of the 4th International Conference on Control and Computer Vision
        August 2021
        207 pages
        ISBN:9781450390477
        DOI:10.1145/3484274
        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Skeleton-based action recognition
        2. causal convolution
        3. spatial-temporal graph
        4. temporal sequence modeling

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        ICCCV'21

        Article Metrics

        • Total Citations: 0
        • Total Downloads: 91
        • Downloads (Last 12 months): 4
        • Downloads (Last 6 weeks): 0
        Reflects downloads up to 27 Feb 2025
