A lightweight graph convolutional network for skeleton-based action recognition

Pham, Dinh-Tan; Pham, Quang-Tien; Nguyen, Tien-Thanh; Le, Thi-Lan; Vu, Hai

doi:10.1007/s11042-022-13298-w

Track 1: General Multimedia Topics
Published: 18 June 2022

Multimedia Tools and Applications, volume 82, pages 3055–3079 (2023)

Dinh-Tan Pham^1,3, Quang-Tien Pham^2, Tien-Thanh Nguyen^2, Thi-Lan Le^2,3 & Hai Vu^2,3 (ORCID: orcid.org/0000-0003-2880-4417)

484 Accesses
1 Citation
1 Altmetric

Abstract

Human action recognition has been an attractive research topic in recent years due to its wide range of applications. Among existing methods, Graph Convolutional Networks (GCNs) achieve remarkable results by exploiting the graph nature of skeleton data in both the spatial and temporal domains. Noise from pose estimation errors is an inherent issue that can seriously degrade action recognition performance. Existing graph-based methods focus mainly on improving recognition accuracy, whereas low-complexity models are required for applications on devices with limited computation capacity. In this paper, a lightweight model is proposed by pruning layers and adding Feature Fusion and Preset Joint Subset Selection modules. The proposed model takes advantage of recent GCNs and of selecting informative joints. Two graph topologies are defined for the selected joints. Extensive experiments are conducted on public datasets to evaluate the performance of the proposed method. Experimental results show that the method outperforms the baselines on datasets with serious noise in the skeleton data. In addition, the proposed method has 5.6 times fewer parameters than the baseline. The proposed lightweight models therefore offer feasible solutions for developing practical applications.
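
The idea of selecting a preset subset of joints and defining a graph over them can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the five-joint toy skeleton, the chosen subset, and all function names are hypothetical, and the symmetric normalization is the one commonly used in GCN layers rather than a detail confirmed by the abstract. The paper's actual joint subsets, its two graph topologies, and its Feature Fusion module are not reproduced here.

```python
import math

# Hypothetical toy skeleton with 5 joints (NOT the paper's joint layout).
EDGES = [(0, 1), (1, 2), (2, 3), (1, 4)]


def select_joints(frames, subset):
    """Keep only the joints listed in `subset` from every frame."""
    return [[frame[j] for j in subset] for frame in frames]


def induced_adjacency(edges, subset):
    """Adjacency matrix (with self-loops) of the graph induced on `subset`."""
    index = {joint: i for i, joint in enumerate(subset)}
    n = len(subset)
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a, b in edges:
        if a in index and b in index:
            adj[index[a]][index[b]] = 1.0
            adj[index[b]][index[a]] = 1.0
    return adj


def normalize(adj):
    """Symmetric normalization D^(-1/2) A D^(-1/2)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def aggregate(norm_adj, joints):
    """One spatial aggregation step: each joint mixes its neighbours' coordinates."""
    dims = len(joints[0])
    return [[sum(norm_adj[i][k] * joints[k][d] for k in range(len(joints)))
             for d in range(dims)]
            for i in range(len(norm_adj))]


# Two frames of made-up 3D coordinates for the 5 toy joints.
frames = [
    [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0), (-1.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.1), (0.0, 1.0, 0.1), (1.0, 1.2, 0.1), (2.0, 1.4, 0.1), (-1.0, 1.0, 0.1)],
]
subset = [1, 2, 3]  # hypothetical "informative" joints, e.g. one arm chain

selected = select_joints(frames, subset)
adj = induced_adjacency(EDGES, subset)
norm = normalize(adj)
mixed = aggregate(norm, selected[0])
```

Dropping joints shrinks the adjacency matrix and the per-frame input, which is one way a model over fewer joints can reduce computation; how much the paper gains from this step versus layer pruning is not stated in the abstract.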

Acknowledgements

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2017.315.

Author information

Authors and Affiliations

Faculty of IT, Hanoi University of Mining and Geology, Hanoi, Vietnam
Dinh-Tan Pham
School of Electrical and Electronic Engineering (SEEE), Hanoi University of Science and Technology, Hanoi, Vietnam
Quang-Tien Pham, Tien-Thanh Nguyen, Thi-Lan Le & Hai Vu
Computer Vision Department, MICA International Research Institute, Hanoi University of Science and Technology, Hanoi, Vietnam
Dinh-Tan Pham, Thi-Lan Le & Hai Vu

Corresponding author

Correspondence to Hai Vu.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pham, DT., Pham, QT., Nguyen, TT. et al. A lightweight graph convolutional network for skeleton-based action recognition. Multimed Tools Appl 82, 3055–3079 (2023). https://doi.org/10.1007/s11042-022-13298-w

Received: 25 January 2022
Revised: 25 January 2022
Accepted: 15 May 2022
Published: 18 June 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s11042-022-13298-w
