Position constrained network for 3D human pose estimation

Dong, Xiena; Yu, Jun; Zhang, Jian

doi:10.1007/s00530-021-00880-9

Special Issue Paper
Published: 02 February 2022

Multimedia Systems, Volume 29, pages 459–468 (2023)

Abstract

Human pose estimation is a challenging research task in computer vision. Current mainstream methods have made great progress, but they remain weak in two respects: first, the feature extraction module is not strong enough for representation learning; second, the training process does not take full advantage of the projection model from 3D space to the 2D image plane. In this work, we propose a human pose estimation framework that takes the 3D root coordinates as an auxiliary input alongside the 2D joint coordinates, providing a positional reference for the recovered 3D joint coordinates, and that uses the intrinsic camera parameters to construct additional projection constraints on the recovered 3D joint coordinates. Moreover, we enhance feature learning through a residual branch. We tested our method on two benchmark datasets for human pose estimation, and the experimental results show that the proposed model outperforms current mainstream algorithms.
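The abstract does not give the form of the projection constraint, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a standard pinhole camera with focal lengths `fx`, `fy` and principal point `cx`, `cy`, assumes the network predicts root-relative 3D joints that are shifted by the 3D root coordinate into camera space, and assumes the constraint is the mean pixel distance between the reprojected joints and the input 2D joints. The names `project_points` and `reprojection_loss` are hypothetical.

```python
import numpy as np


def project_points(joints_cam, fx, fy, cx, cy):
    """Project camera-space 3D points of shape (N, 3) to pixels of shape (N, 2).

    Assumes a distortion-free pinhole model; depths must be positive.
    """
    x, y, z = joints_cam[:, 0], joints_cam[:, 1], joints_cam[:, 2]
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=1)


def reprojection_loss(rel_joints_3d, root_3d, joints_2d, fx, fy, cx, cy):
    """Mean pixel error between reprojected 3D joints and the 2D joints.

    rel_joints_3d: (N, 3) root-relative joints (assumed network output).
    root_3d:       (3,)   root joint position in camera space.
    joints_2d:     (N, 2) 2D joint coordinates in pixels.
    """
    # The root coordinate acts as the positional reference: it moves the
    # root-relative pose to its absolute location in camera space.
    abs_joints = rel_joints_3d + root_3d
    projected = project_points(abs_joints, fx, fy, cx, cy)
    return float(np.mean(np.linalg.norm(projected - joints_2d, axis=1)))


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    root = np.array([0.0, 0.0, 4.0])
    rel = np.array([[0.0, 0.0, 0.0], [0.4, -0.2, 0.0]])
    observed = project_points(rel + root, 1000.0, 1000.0, 500.0, 500.0)
    print(reprojection_loss(rel, root, observed, 1000.0, 1000.0, 500.0, 500.0))
```

In a training setup this term would typically be added to the 3D joint regression loss with a weighting factor; the weight and the exact distance measure are choices this preview does not specify.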

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant 61836002, Grant 62125201, Grant 62020106007, Grant 62002314 and Grant 61972361.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
Xiena Dong & Jun Yu
Department of Science and Technology, Zhejiang International Studies University, Hangzhou, 310023, China
Jian Zhang

Corresponding author

Correspondence to Jun Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dong, X., Yu, J. & Zhang, J. Position constrained network for 3D human pose estimation. Multimedia Systems 29, 459–468 (2023). https://doi.org/10.1007/s00530-021-00880-9

Received: 24 August 2021
Accepted: 09 December 2021
Published: 02 February 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s00530-021-00880-9
