DOI: 10.1145/3574131.3574448
research-article

Monocular Human Body Shape Estimation: A Generation-aid Approach

Published: 13 January 2023

Abstract

Observing human beings in monocular images is one of the basic tasks of computer vision. Reconstructing a human body from a monocular image mainly involves reconstructing its pose and its body shape. Past studies, however, have concentrated on pose estimation and largely neglected body shape; this paper focuses on estimating the body shape of a 3D model. Learning body parameters via instance segmentation requires a large number of labels, while parameters derived from pose estimation depend entirely on the results of keypoint detection, which performs poorly on images with poor viewing angles and low resolution. To address these problems, we propose a method for automatically generating datasets. The generated dataset provides low-resolution images, covering various angles and blurred shapes, together with their labels. On this low-resolution, poorly angled dataset, we propose a generation-assisted deep learning network framework. Experiments show that the framework can effectively estimate the body shape parameters of the model from monocular images.

Cited By

  • (2023) Location of acupuncture points based on graph convolution and 3D deep learning in virtual humans. Computer Animation and Virtual Worlds 34:6. DOI: 10.1002/cav.2159. Online publication date: 21-Jun-2023

Published In

VRCAI '22: Proceedings of the 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry
December 2022
284 pages
ISBN:9798400700316
DOI:10.1145/3574131
Editors: Enhua Wu, Lionel Ming-Shuan Ni, Zhigeng Pan, Daniel Thalmann, Ping Li, Charlie C.L. Wang, Lei Zhu, Minghao Yang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. 3D human reconstruction
  2. Body shape
  3. Deep learning
  4. Generative network
  5. Monocular image

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

VRCAI '22

Acceptance Rates

Overall Acceptance Rate 51 of 107 submissions, 48%
