HQLC-Overlap: an adaptive low-cost binocular 3D human pose estimation model

Wang, Hao; Sun, Minghui

doi:10.1007/s11042-022-14156-5

Published: 10 November 2022

Volume 82, pages 17159–17173, (2023)

Multimedia Tools and Applications

Hao Wang^1,2 &
Minghui Sun^1,2

Abstract

Single-view 3D human pose estimation suffers from occlusion and blind spots that a single view cannot fully resolve. Multi-view models address these problems, but multi-view training and complex view fusion greatly increase training and deployment cost. We therefore propose HQLC-Overlap, a novel model based on dynamic binocular 3D pose overlap. A view filtering method selects the two best observation views of the pose, and the model then uses these two views to simulate how human eyes cooperate to image an object in 3D with high precision. Compared with most current single-view or multi-view models, HQLC-Overlap retains the advantages of a single-view model built on a high-quality view attention mechanism while overcoming the inherent limitations of single-view estimation through its binocular estimation mode. Using the filtered views, we also compute and visualize the model’s estimation error over a large number of pose images and correct it. By design, HQLC-Overlap is fast, has low computational cost, and adapts dynamically to multiple views. We evaluate it on two large-scale human pose datasets, with an ablation study and comparisons against other models. The experimental results show that it greatly improves 3D pose estimation accuracy.
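
The two-step idea in the abstract (filter the views, then overlap the two best ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how views are scored or how the two poses are combined. The per-view quality scores, the function names `select_two_best_views` and `overlap_poses`, the camera names, and the score-weighted average are all assumptions made for illustration, and the poses are assumed to be already expressed in a common coordinate frame.

```python
from typing import Dict, List, Tuple

Joint = Tuple[float, ...]   # (x, y, z) in a shared world frame (assumed)
Pose = List[Joint]


def select_two_best_views(scores: Dict[str, float]) -> List[str]:
    """Return the names of the two views with the highest quality score.

    The scores are hypothetical; the paper's view filtering method is not
    described in the abstract.
    """
    if len(scores) < 2:
        raise ValueError("need at least two candidate views")
    return sorted(scores, key=scores.get, reverse=True)[:2]


def overlap_poses(pose_a: Pose, pose_b: Pose, w_a: float, w_b: float) -> Pose:
    """Fuse two 3D poses joint by joint with a score-weighted average.

    This simple average stands in for the binocular overlap step and is an
    assumption, not the published method.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of joints")
    total = w_a + w_b
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        tuple((w_a * a + w_b * b) / total for a, b in zip(joint_a, joint_b))
        for joint_a, joint_b in zip(pose_a, pose_b)
    ]


# Hypothetical four-camera setup with two joints per pose.
scores = {"cam1": 0.75, "cam2": 0.125, "cam3": 0.25, "cam4": 0.0}
poses = {
    "cam1": [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
    "cam2": [(9.0, 9.0, 9.0), (9.0, 9.0, 9.0)],
    "cam3": [(4.0, 0.0, 0.0), (4.0, 2.0, 0.0)],
    "cam4": [(7.0, 7.0, 7.0), (7.0, 7.0, 7.0)],
}

first, second = select_two_best_views(scores)
fused = overlap_poses(poses[first], poses[second], scores[first], scores[second])
print(first, second)  # cam1 cam3
print(fused)          # [(1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
```

Only the two selected views enter the fusion step, which is where the low cost claimed in the abstract would come from under these assumptions: the remaining views are discarded before any 3D combination is attempted.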

Acknowledgements

This study was partially supported by the National Natural Science Foundation of China (61872164), the Program of Science and Technology Development Plan of Jilin Province of China (20220201147GX), and the Fundamental Research Funds for the Central Universities (2022-JCXK-02).

Author information

Authors and Affiliations

College of Computer Science and Technology, Jilin University, Changchun, 130012, Jilin Province, China
Hao Wang & Minghui Sun
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012, Jilin Province, China
Hao Wang & Minghui Sun

Corresponding author

Correspondence to Minghui Sun.

Ethics declarations

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, H., Sun, M. HQLC-Overlap: an adaptive low-cost binocular 3D human pose estimation model. Multimed Tools Appl 82, 17159–17173 (2023). https://doi.org/10.1007/s11042-022-14156-5

Received: 22 February 2022
Revised: 18 May 2022
Accepted: 27 October 2022
Published: 10 November 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11042-022-14156-5
