Spatiotemporal smoothing aggregation enhanced multi-scale residual deep graph convolutional networks for skeleton-based gait recognition

Chen, Guanghai; Chen, Xin; Zheng, Chengzhi; Wang, Junshu; Liu, Xinchao; Han, Yuxing

doi:10.1007/s10489-024-05422-0

Published: 08 May 2024

Volume 54, pages 6154–6174, (2024)

Applied Intelligence

Guanghai Chen^1,2^na1,
Xin Chen^1,2,3,4,5^na1,
Chengzhi Zheng⁶,
Junshu Wang⁷,
Xinchao Liu^1,2 &
…
Yuxing Han ORCID: orcid.org/0000-0003-3553-6764⁸

Abstract

Gait recognition has broad development potential, in part because of its noncontact nature. Skeleton-based recognition is preferred because self-occlusion and environmental factors degrade silhouette-based methods. To exploit the discriminative properties of both long-term and short-term temporal cues, we propose spatiotemporal smoothing aggregation enhanced multiscale residual deep graph convolutional networks. The method considers both long and short gait feature time series, enabling the learning of discriminative multiscale representations. In the baseline network, features at three scales are extracted sequentially, followed by a reverse process that extracts and fuses the multiscale features. This design significantly strengthens the ability of graph convolution to model the contextual knowledge of human poses. We also investigate multiscale gait feature aggregation, which significantly mitigates oversmoothing effects. A spatiotemporal smoothing aggregation module with an embedded attention mechanism is introduced to hierarchically aggregate and enhance multiscale key joint features; this module alleviates oversmoothing in deep graph convolutional networks. The method was tested on the Chinese Academy of Sciences Institute of Automation (CASIA-B) dataset, achieving an average accuracy of 78.2%, the second highest among skeleton-based gait recognition models currently available, and it attained rank-1 accuracies of 14.7 and 8.19 on the Gait Recognition in the Wild (GREW) and Gait3D datasets, respectively.
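
To make the oversmoothing problem named in the abstract concrete, the following is a minimal NumPy sketch. It is not the authors' module (their code is not released): the five-joint chain graph, the weight-free propagation, and the helper names `chain_propagation_matrix`, `propagate` and `joint_spread` are all assumptions made for illustration. The mitigation shown is a generic initial-residual connection, swapped in here only to show how mixing earlier features back in keeps joints distinguishable.

```python
import numpy as np


def chain_propagation_matrix(num_joints: int) -> np.ndarray:
    """Row-normalised adjacency (with self-loops) of a simple chain of joints."""
    adj = np.eye(num_joints)
    for i in range(num_joints - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    return adj / adj.sum(axis=1, keepdims=True)


def propagate(features: np.ndarray, prop: np.ndarray, num_layers: int,
              residual_weight: float = 0.0) -> np.ndarray:
    """Apply `num_layers` weight-free graph propagation steps.

    residual_weight = 0 is plain neighbourhood averaging; a value > 0 mixes
    the input features back in at every layer (a generic initial residual,
    assumed here for illustration only).
    """
    hidden = features.copy()
    for _ in range(num_layers):
        hidden = (1.0 - residual_weight) * (prop @ hidden) + residual_weight * features
    return hidden


def joint_spread(hidden: np.ndarray) -> float:
    """Mean standard deviation across joints; near zero means the joints
    have become indistinguishable (oversmoothed)."""
    return float(np.std(hidden, axis=0).mean())


prop = chain_propagation_matrix(5)
features = np.arange(5, dtype=float).reshape(5, 1)  # toy 1-D feature per joint

plain = propagate(features, prop, num_layers=64)
with_residual = propagate(features, prop, num_layers=64, residual_weight=0.2)

print("input spread:        ", joint_spread(features))
print("64 plain layers:     ", joint_spread(plain))
print("64 layers + residual:", joint_spread(with_residual))
```

In this toy setting the plain stack drives the spread across joints toward zero, while the residual variant retains a clearly non-zero spread. Real graph convolutional layers also have learned weights and nonlinearities, so this only illustrates the tendency, not the behaviour of any particular network.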

Data availability and access

The research in this paper is based on three publicly available datasets: the CASIA-B Gait Database (http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp), Gait Recognition in the Wild (GREW, https://www.grew-benchmark.org/) and the Gait3D dataset (https://gait3d.github.io/). Use of these datasets requires permission from the data owners.

Code availability

Due to legal considerations, we are unable to open-source the code for this study at this moment. Your understanding and support are greatly appreciated.

Funding

This study was funded by Shenzhen Startup Funding (No. QD2023014C) and the National Natural Science Foundation of China (No. 61906074).

Author information

Guanghai Chen and Xin Chen contributed equally to this work.

Authors and Affiliations

College of Electronic Engineering, College of Artificial Intelligence, South China Agricultural University, Wushan Street five road No. 483, Guangzhou, 510642, Guangdong, China
Guanghai Chen, Xin Chen & Xinchao Liu
National Center for International Collaboration Research on Precision Agricultural Aviation Pesticides Spraying Technology, South China Agricultural University, Wushan Street five road No. 483, Guangzhou, 510642, Guangdong, China
Guanghai Chen, Xin Chen & Xinchao Liu
Guangdong Engineering Technology Research Center of Smart Agriculture, Guangzhou, 510642, Guangdong, China
Xin Chen
Key Laboratory of Smart Agricultural Technology in Tropical South China, Guangzhou, 510642, Guangdong, China
Xin Chen
Ministry of Agriculture and Rural Affairs, People’s Republic of China, Guangzhou, 510642, Guangdong, China
Xin Chen
Harbin Institute of Technology National Engineering Research Center of Urban Water Resources Co., Ltd., No.73, Huanghe Road, Nangang District, Harbin, 150000, Heilongjiang, China
Chengzhi Zheng
School of robotics, Guangdong Open University, No.1 Xiandang West Road, Yuexiu District, Guangzhou, 510091, Guangdong, China
Junshu Wang
Shenzhen International Graduate School, Tsinghua University, University Town of Shenzhen, Nanshan District, Shenzhen, 518057, People’s Republic of China
Yuxing Han

Contributions

Guanghai Chen developed and implemented the algorithms and models used in the study and wrote the first draft. Xin Chen was responsible for the project design and overall plan of the study. Chengzhi Zheng was responsible for data organization. Junshu Wang reviewed the manuscript and suggested important changes to its content. Xinchao Liu was responsible for the visualization and graphing of the experimental results. Yuxing Han is the corresponding author of this study. Guanghai Chen and Xin Chen contributed equally to this work and should be considered co-first authors.

Corresponding author

Correspondence to Yuxing Han.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Ethical and informed consent for data used

Not applicable. The work in this paper has no ethical or moral implications such as human or animal experimentation. The work presented in this article is entirely original and has not been published in any other journal. This journal is the first and only journal to which the paper has been submitted. There are no violations of academic ethics. The right to use the data in this study has been approved by the data owners.

Consent to participate

All participants in this study were informed and consented to participate in the study.

Consent for publication

Participants in this study gave their consent for the results to be used for publication, presentation, or sharing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, G., Chen, X., Zheng, C. et al. Spatiotemporal smoothing aggregation enhanced multi-scale residual deep graph convolutional networks for skeleton-based gait recognition. Appl Intell 54, 6154–6174 (2024). https://doi.org/10.1007/s10489-024-05422-0

Accepted: 27 March 2024
Published: 08 May 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s10489-024-05422-0
