High-order local connection network for 3D human pose estimation based on GCN

Wu, Wei; Zhou, Dongsheng; Zhang, Qiang; Dong, Jing; Wei, Xiaopeng

doi:10.1007/s10489-022-03312-x

Published: 17 March 2022

Volume 52, pages 15690–15702, (2022)

Applied Intelligence

Wei Wu¹,
Dongsheng Zhou¹,²,
Qiang Zhang¹,²,
Jing Dong¹ (ORCID: orcid.org/0000-0003-3489-6661) &
Xiaopeng Wei²

Abstract

The skeleton structure of the human body is a natural undirected graph. Applied to 3D body pose estimation, graph convolutional networks (GCNs) have achieved good results. However, the vanilla GCN ignores both the differences between joints and the connections between joints at different distances. To address these two problems, we propose the High-order Local Connection Network (HLCN) for 3D human pose estimation. On the one hand, different filters are assigned to different joints to produce different weights. On the other hand, the features of multi-hop joints are gathered together in HLCN. Furthermore, we study different methods of fusing these multi-hop features and compare their performance. The new network not only takes the differences between the joints in the human skeleton into consideration, but also captures the long-range dependencies between human joints. Experiments suggest that this method is superior to the vanilla GCN and achieves state-of-the-art performance. The average error on the H36M dataset is 50.9 mm.
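
The two ideas in the abstract, per-joint filters and multi-hop feature gathering, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`k_hop_adjacency`, `normalize_rows`, `high_order_local_layer`), the 5-joint chain skeleton, the row normalisation, and the choice to fuse hops by summation are all assumptions made for illustration. The abstract says several fusion methods were compared but does not say which one was adopted.

```python
import numpy as np

def k_hop_adjacency(adj, k):
    """Binary matrix marking joint pairs whose shortest-path distance is exactly k."""
    n = adj.shape[0]
    if k == 0:
        return np.eye(n)
    step = ((adj > 0) | np.eye(n, dtype=bool)).astype(int)
    reach_prev = np.eye(n, dtype=bool)   # pairs within k-1 hops
    reach = np.eye(n, dtype=bool)        # pairs within k hops
    for _ in range(k):
        reach_prev = reach
        reach = (reach.astype(int) @ step) > 0
    return (reach & ~reach_prev).astype(float)

def normalize_rows(m):
    """Divide each row by its sum; rows that sum to zero stay zero."""
    deg = m.sum(axis=1, keepdims=True)
    return np.divide(m, deg, out=np.zeros_like(m), where=deg > 0)

def high_order_local_layer(x, adj, weights):
    """x: (J, C_in); weights: (K+1, J, C_in, C_out).

    Assumption: each joint j has its own filter weights[k, j] for every hop k,
    and the hop-wise results are fused by summation followed by ReLU.
    """
    out = 0.0
    for k in range(weights.shape[0]):
        a_k = normalize_rows(k_hop_adjacency(adj, k))
        # Joint-specific transform instead of one filter shared by all joints.
        h = np.einsum("jc,jcd->jd", x, weights[k])
        # Gather the transformed features of joints exactly k hops away.
        out = out + a_k @ h
    return np.maximum(out, 0.0)

# Hypothetical 5-joint chain skeleton: 0-1-2-3-4.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))              # e.g. 2D joint coordinates as input features
weights = rng.normal(size=(3, 5, 2, 4))  # hops 0, 1 and 2; 4 output channels
out = high_order_local_layer(x, adj, weights)
print(out.shape)  # (5, 4)
```

A vanilla GCN layer corresponds to using a single hop and one weight matrix shared by all joints. The sketch differs in both respects, which is what the abstract's two criticisms of the vanilla GCN point to.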

Acknowledgements

This work was supported in part by the Key Program of NSFC (Grant No. U1908214), the Dalian University Scientific Research Platform Project (No. 202101YB03), the Special Project of Central Government Guiding Local Science and Technology Development (Grant No. 2021JH6/10500140), the Program for the Liaoning Distinguished Professor, the Program for Innovative Research Team in University of Liaoning Province, Dalian and Dalian University, and in part by the Science and Technology Innovation Fund of Dalian (Grant No. 2020JJ25CY001).

Author information

Authors and Affiliations

National and Local Joint Engineering Laboratory of Computer Aided Design, School of Software Engineering, DaLian University, Dalian, 116622, LiaoNing, China
Wei Wu, Dongsheng Zhou, Qiang Zhang & Jing Dong
School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, LiaoNing, China
Dongsheng Zhou, Qiang Zhang & Xiaopeng Wei

Corresponding authors

Correspondence to Jing Dong or Xiaopeng Wei.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wu, W., Zhou, D., Zhang, Q. et al. High-order local connection network for 3D human pose estimation based on GCN. Appl Intell 52, 15690–15702 (2022). https://doi.org/10.1007/s10489-022-03312-x

Accepted: 28 January 2022
Published: 17 March 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10489-022-03312-x
