Research on decoupled adaptive graph convolution networks based on skeleton data for action recognition

  • Original Article
  • Pattern Analysis and Applications

Abstract

Graph convolutional networks are well suited to extracting features from non-Euclidean human skeleton data, but their adjacency matrix is fixed and their receptive field is small, which leads to a biased representation of the skeleton's intrinsic information. In addition, mean pooling of the spatio-temporal features in the classification layer loses information and degrades recognition accuracy. To address these two problems, the Decoupled Adaptive Graph Convolutional Network (DAGCN) is proposed. Specifically, a multi-level adaptive adjacency matrix is designed, which dynamically captures rich correlation information among the skeleton nodes through a non-local adaptive algorithm. A new Residual Multi-scale Temporal Convolution Network (RMTCN) is then proposed to fully extract the temporal features of the decoupled skeleton data. For the second problem, in classification, we decompose the spatio-temporal features into three parts: spatial, temporal, and spatio-temporal information. Each part is average-pooled separately, and the results are added together for classification; this is denoted the STMP (spatio-temporal mean pooling) module. Experimental results show that our algorithm achieves accuracies of 96.5%, 90.6%, and 96.4% on the NTU-RGB+D60, NTU-RGB+D120, and NW-UCLA data sets, respectively.
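
To make the idea of an adaptive adjacency matrix concrete, the following is a minimal NumPy sketch, not the DAGCN implementation. It assumes the common non-local (embedded dot-product) formulation, in which a data-dependent joint-to-joint similarity matrix is added to the fixed, normalized skeleton graph. The function names, the projection weights `w_theta` and `w_phi`, the scaling by embedding size, and the toy 5-joint chain skeleton are all illustrative assumptions; the multi-level design described in the abstract is not reproduced here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def normalize_adjacency(a):
    # Add self-loops, then apply D^{-1/2} (A + I) D^{-1/2}.
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def adaptive_adjacency(x, a_fixed, w_theta, w_phi):
    """x: (C, T, V) features; a_fixed: (V, V); w_theta, w_phi: (C_e, C)."""
    v = x.shape[2]
    # Two channel projections (hypothetical weights), flattened over
    # embedding channels and time so that each joint gets one vector.
    theta = np.einsum("ec,ctv->etv", w_theta, x).reshape(-1, v)  # (C_e*T, V)
    phi = np.einsum("ec,ctv->etv", w_phi, x).reshape(-1, v)      # (C_e*T, V)
    # Similarity between every pair of joints, softmax-normalized per row.
    sim = theta.T @ phi / theta.shape[0]                          # (V, V)
    a_data = softmax(sim, axis=-1)
    # Fixed physical graph plus data-dependent (non-local) graph.
    return normalize_adjacency(a_fixed) + a_data

def graph_conv(x, a, w_out):
    # Aggregate features over joints with A, then mix channels with w_out (C_out, C).
    agg = np.einsum("ctv,vw->ctw", x, a)
    return np.einsum("oc,ctw->otw", w_out, agg)

rng = np.random.default_rng(0)
C, T, V, C_e = 3, 8, 5, 4
# Toy skeleton: five joints connected in a chain (assumption for illustration).
a_fixed = np.zeros((V, V))
for i in range(V - 1):
    a_fixed[i, i + 1] = a_fixed[i + 1, i] = 1.0
x = rng.standard_normal((C, T, V))
a = adaptive_adjacency(x, a_fixed,
                       rng.standard_normal((C_e, C)),
                       rng.standard_normal((C_e, C)))
y = graph_conv(x, a, rng.standard_normal((16, C)))
print(a.shape, y.shape)  # (5, 5) (16, 8, 5)
```

In this sketch the first and last joints of the chain are not physically connected, yet the data-dependent term gives them a nonzero weight. This is the sense in which a non-local term enlarges the receptive field beyond the fixed skeleton graph.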

Data availability

Our code and data are available on GitHub at https://github.com/linguo123/DAGCN_behavior_recognition.

Acknowledgements

All authors agree to submit the manuscript with the author list as it appears on the title page.

Funding

National Natural Science Foundation of China, 61672305.

Author information

Contributions

Haigang Deng and Guocheng Lin designed the Decoupled Adaptive Graph Convolutional Network and the Residual Multi-scale Temporal Convolution Network and prepared the corresponding tests; Haigang Deng also wrote Sects. 4.1, 4.2, and 4.3, and Guocheng Lin wrote Sects. 4.4, 4.5, and 4.6. Chengwei Li wrote the introduction, related work, and conclusion sections. Chuanxu Wang wrote Sects. 3.1 and 3.2. Wenting Xu wrote Sect. 3.3.

Corresponding authors

Correspondence to Chengwei Li or Chuanxu Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Deng, H., Lin, G., Li, C. et al. Research on decoupled adaptive graph convolution networks based on skeleton data for action recognition. Pattern Anal Applic 27, 118 (2024). https://doi.org/10.1007/s10044-024-01319-3

  • DOI: https://doi.org/10.1007/s10044-024-01319-3

Keywords