ABSTRACT
Graph convolutional networks (GCNs) have drawn attention for skeleton-based action recognition because a human skeleton, with joints as nodes and bones as edges, is naturally represented as a graph. However, existing methods are limited in how they model the temporal sequence of human actions. To incorporate temporal factors into action modeling, we present a novel Temporal-Aware Graph Convolution Network (TA-GCN). First, we design a causal temporal convolution (CTCN) layer that prevents implausible leakage of future information into past frames. Second, we present a novel cross-spatial-temporal graph convolution (3D-GCN) layer that extends an adaptive graph from the spatial to the temporal domain, capturing local cross-spatial-temporal dependencies among joints. By combining these two temporal components, TA-GCN models the sequential nature of human actions. Experimental results on two large-scale datasets, NTU-RGB+D and Kinetics-Skeleton, show that our network improves accuracy by about 1% on both datasets over previous methods.
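The abstract does not include code, so the following is a minimal PyTorch sketch of the causal-convolution idea it describes: a temporal convolution over skeleton tensors shaped (N, C, T, V) that pads only the left side of the time axis, so the output at frame t never sees frames after t. The class name `CausalTemporalConv` and the default `kernel_size` are illustrative assumptions, not the authors' CTCN implementation.

```python
import torch
import torch.nn as nn

class CausalTemporalConv(nn.Module):
    """Illustrative causal temporal convolution (not the paper's exact CTCN).

    Operates on skeleton features shaped (N, C, T, V). Padding is applied
    only on the left of the time axis, so the output at frame t depends
    solely on frames <= t -- no future information leaks into the past.
    """

    def __init__(self, channels, kernel_size=9, dilation=1):
        super().__init__()
        # Left-only padding keeps the layer causal while preserving length T.
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv2d(channels, channels,
                              kernel_size=(kernel_size, 1),
                              dilation=(dilation, 1))

    def forward(self, x):
        # x: (N, C, T, V); pad the time axis (second-to-last dim) on the left.
        x = nn.functional.pad(x, (0, 0, self.pad, 0))
        return self.conv(x)

# Quick shape check: 2 sequences, 64 channels, 50 frames, 25 joints.
x = torch.randn(2, 64, 50, 25)
y = CausalTemporalConv(64)(x)
print(y.shape)  # torch.Size([2, 64, 50, 25])
```

Left-only padding is the standard way to enforce causality in temporal convolutions; a symmetric padding of (kernel_size - 1) // 2 on both sides, as in ordinary temporal convolutions, would let each frame attend to future frames.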
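Likewise, a hedged sketch of the cross-spatial-temporal idea: a graph convolution in which each joint aggregates features from all joints in a small temporal window, with one learnable (adaptive) adjacency matrix per frame offset. The class name `CrossSpatialTemporalGCN`, the window size `tau`, the symmetric window, and the adjacency initialisation are assumptions for illustration; the paper's 3D-GCN layer may differ in these details.

```python
import torch
import torch.nn as nn

class CrossSpatialTemporalGCN(nn.Module):
    """Illustrative cross-spatial-temporal graph convolution.

    Each joint at frame t aggregates features from all joints in the
    tau surrounding frames, using a separate learnable adjacency matrix
    per temporal offset (the adaptive part of the layer).
    """

    def __init__(self, in_channels, out_channels, A, tau=3):
        super().__init__()
        self.tau = tau  # temporal window size (assumed odd)
        # One learnable adjacency per offset, initialised from the
        # skeleton graph A of shape (V, V).
        self.A = nn.Parameter(A.unsqueeze(0).repeat(tau, 1, 1))
        self.proj = nn.Conv2d(in_channels, out_channels, 1)

    def forward(self, x):
        # x: (N, C, T, V)
        N, C, T, V = x.shape
        pad = self.tau // 2
        xp = nn.functional.pad(x, (0, 0, pad, pad))  # pad time axis
        out = 0
        for k in range(self.tau):
            xk = xp[:, :, k:k + T]  # frames at temporal offset k - pad
            # Aggregate joint features of that frame with the k-th adjacency.
            out = out + torch.einsum('nctv,vw->nctw', xk, self.A[k])
        return self.proj(out)

# Usage with a 25-joint skeleton (identity matrix as placeholder adjacency).
A = torch.eye(25)
layer = CrossSpatialTemporalGCN(64, 128, A)
y = layer(torch.randn(2, 64, 50, 25))
print(y.shape)  # torch.Size([2, 128, 50, 25])
```

Note that this sketch uses a symmetric window for simplicity; combining it with the causal layer above (or restricting the window to past offsets) would preserve the no-future-leakage property the abstract emphasizes.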