Abstract
Graph convolutional networks are well suited to extracting features from non-Euclidean human skeleton data, but their adjacency matrix is fixed and their receptive field is small, which biases the representation of the skeleton's intrinsic information. In addition, mean pooling over the spatio-temporal features in the classification layer loses information and degrades recognition accuracy. To address these two problems, the Decoupled Adaptive Graph Convolutional Network (DAGCN) is proposed. Specifically, a multi-level adaptive adjacency matrix is designed, which dynamically captures the rich correlation information among the skeleton nodes through a non-local adaptive algorithm. A new Residual Multi-scale Temporal Convolution Network (RMTCN) is then proposed to fully extract the temporal features of the decoupled skeleton data. For the second problem, in classification, we decompose the spatio-temporal features into three parts: spatial, temporal, and spatio-temporal information. Each part is average-pooled separately, and the results are added together for classification; this is denoted the STMP (spatio-temporal mean pooling) module. Experimental results show that our algorithm achieves accuracies of 96.5%, 90.6%, and 96.4% on the NTU-RGB+D60, NTU-RGB+D120, and NW-UCLA data sets, respectively.
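To make the idea of a data-dependent adjacency matrix concrete, the following is a minimal NumPy sketch of one common non-local formulation (an embedded-Gaussian joint similarity followed by a row-wise softmax). It is an illustration under stated assumptions, not the DAGCN implementation: the function name `adaptive_adjacency`, the projection matrices `w_theta` and `w_phi`, the mixing weight `alpha`, and the identity matrix used as the fixed adjacency are all hypothetical, and the multi-level design described in the abstract is not reproduced.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def adaptive_adjacency(x, w_theta, w_phi, a_fixed, alpha=1.0):
    """Return a fixed adjacency plus a data-dependent (non-local) term.

    x       : features of shape (C, T, V) = channels, frames, joints
    w_theta : hypothetical embedding matrix of shape (C_e, C)
    w_phi   : hypothetical embedding matrix of shape (C_e, C)
    a_fixed : fixed skeleton adjacency of shape (V, V)
    alpha   : assumed mixing weight for the data-dependent term
    """
    v = x.shape[-1]
    # Embed every joint, then flatten channels and frames: (C_e * T, V).
    theta = np.einsum("ec,ctv->etv", w_theta, x).reshape(-1, v)
    phi = np.einsum("ec,ctv->etv", w_phi, x).reshape(-1, v)
    # Pairwise joint similarity, scaled by the embedding length for stability.
    sim = theta.T @ phi / theta.shape[0]   # (V, V)
    a_data = softmax(sim, axis=-1)         # each row sums to 1
    return a_fixed + alpha * a_data


# Toy input: 3 channels, 8 frames, 25 joints (sizes chosen for illustration).
rng = np.random.default_rng(0)
C, T, V, C_e = 3, 8, 25, 4
x = rng.standard_normal((C, T, V))
a_fixed = np.eye(V)  # placeholder for a real skeleton graph
A = adaptive_adjacency(
    x,
    rng.standard_normal((C_e, C)),
    rng.standard_normal((C_e, C)),
    a_fixed,
)
# One graph-convolution step: aggregate joint features through A.
y = np.einsum("ctv,vw->ctw", x, A)
```

Because the data-dependent term is recomputed from each input sample, joints that are not physically connected can still exchange information, which is the sense in which such a matrix enlarges the receptive field of a fixed skeleton graph.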
Data availability
Our code and data have been uploaded to GitHub at https://github.com/linguo123/DAGCN_behavior_recognition.
Acknowledgements
All authors agree to submit the manuscript with the author list as it appears on the title page.
Funding
National Natural Science Foundation of China, 61672305.
Author information
Authors and Affiliations
Contributions
Haigang Deng and Guocheng Lin designed the Decoupled Adaptive Graph Convolutional Network and the Residual Multi-scale Temporal Convolution Network, and prepared the corresponding tests; Haigang Deng also wrote Sects. 4.1, 4.2, and 4.3, and Guocheng Lin wrote Sects. 4.4, 4.5, and 4.6. Chengwei Li wrote the introduction, related work, and conclusion sections. Chuanxu Wang wrote Sects. 3.1 and 3.2. Wenting Xu wrote Sect. 3.3.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Deng, H., Lin, G., Li, C. et al. Research on decoupled adaptive graph convolution networks based on skeleton data for action recognition. Pattern Anal Applic 27, 118 (2024). https://doi.org/10.1007/s10044-024-01319-3
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10044-024-01319-3