MFGCN: an efficient graph convolutional network based on multi-order feature information for human skeleton action recognition

  • Original Article
  • Published in Neural Computing and Applications

Abstract

With the development of depth sensors and pose estimation algorithms, human skeleton action recognition based on graph convolutional networks has attracted widespread attention and application. The latest methods dynamically learn distinct topologies for modeling and exploit first-order, second-order, and third-order features, i.e., joint, bone, and motion representations, achieving high accuracy. However, many models still confuse actions with similar motion trajectories, and most existing methods model the spatial dimension before the temporal dimension, even though spatial and temporal information should be interrelated. In this paper, we propose an efficient graph convolutional network based on multi-order feature information (MFGCN) for human skeleton action recognition. First, our method introduces angle features (denoted as fourth-order features), implicitly embedded alongside the lower-order features by encoding the angles formed between joints, to capture detailed features in the spatio-temporal dimension and enhance the ability to distinguish similar actions. Second, we use a content-adaptive approach to construct the adjacency matrix and dynamically learn the topology between skeleton joints. Finally, we develop a spatio-temporal information sliding extraction module (STISE) to improve the inter-correlation of spatial and temporal information. We evaluate the proposed method extensively on the NTU RGB+D, NTU RGB+D 120, and Northwestern-UCLA datasets; the results show that it achieves superior performance compared with current state-of-the-art methods.
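To make the fourth-order (angle) features concrete: one common way to encode an angle feature is the angle at a joint formed by the segments to two neighboring joints, computed per frame. The sketch below is an illustrative assumption, not the paper's exact formulation; the joint-triple choices and the 3-joint toy skeleton are hypothetical.

```python
import numpy as np

def joint_angles(joints, triples):
    """Angle at joint b between segments (b->a) and (b->c) for each
    (a, b, c) triple, per frame.

    joints:  array of shape (T, V, 3) -- T frames, V joints, 3-D coordinates.
    triples: list of (a, b, c) joint-index triples (a hypothetical choice).
    Returns an array of shape (T, len(triples)) of angles in radians.
    """
    angles = []
    for a, b, c in triples:
        v1 = joints[:, a] - joints[:, b]   # segment b -> a, shape (T, 3)
        v2 = joints[:, c] - joints[:, b]   # segment b -> c, shape (T, 3)
        cos = np.sum(v1 * v2, axis=-1) / (
            np.linalg.norm(v1, axis=-1) * np.linalg.norm(v2, axis=-1) + 1e-8)
        # clip guards against rounding outside arccos's domain [-1, 1]
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return np.stack(angles, axis=-1)

# Toy example: elbow angle from shoulder(0) - elbow(1) - wrist(2),
# a single frame with a right angle at the elbow.
seq = np.array([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]])
ang = joint_angles(seq, [(0, 1, 2)])   # -> approximately pi/2
```

Angles of this kind are invariant to translation and rotation of the whole skeleton, which is one intuition for why they help separate actions with similar joint trajectories.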
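The content-adaptive adjacency matrix can likewise be sketched. A widely used recipe (as in adaptive GCN variants; not necessarily MFGCN's exact design) projects per-joint features through two learnable embeddings and normalizes their pairwise similarities. All names and shapes below are illustrative assumptions.

```python
import numpy as np

def adaptive_adjacency(x, w_theta, w_phi):
    """Content-adaptive adjacency from per-joint features.

    x:               (V, C) joint features (e.g. averaged over time).
    w_theta, w_phi:  (C, D) learnable projection matrices.
    Returns a (V, V) row-stochastic adjacency matrix whose entries are
    learned, data-dependent edge weights between skeleton joints.
    """
    theta = x @ w_theta              # (V, D) query-like embedding
    phi = x @ w_phi                  # (V, D) key-like embedding
    logits = theta @ phi.T           # (V, V) pairwise similarities
    logits -= logits.max(axis=1, keepdims=True)   # softmax stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
V, C, D = 25, 64, 16                 # 25 joints, as in NTU RGB+D
A = adaptive_adjacency(rng.normal(size=(V, C)),
                       rng.normal(size=(C, D)),
                       rng.normal(size=(C, D)))
```

Because the adjacency is computed from the input features rather than fixed by the physical skeleton, distant joints (e.g. two hands during clapping) can acquire strong edges when the content warrants it.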


Data availability

The datasets are available from the ROSE Lab at https://rose1.ntu.edu.sg/dataset/actionRecognition/.

Code availability

The code is available from the first author on reasonable request.


Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant 62267007, Gansu Provincial Department of Education Industrial Support Plan Project under Grant 2022CYZC-16.

Author information

Corresponding author: Jinlin Hu.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest or competing interests to declare that are relevant to the content of this article.

Consent to participate

Not applicable.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Qi, Y., Hu, J., Han, X. et al. MFGCN: an efficient graph convolutional network based on multi-order feature information for human skeleton action recognition. Neural Comput & Applic 35, 19979–19995 (2023). https://doi.org/10.1007/s00521-023-08814-4
