
Monocular Human Body Shape Estimation: A Generation-aid Approach

Author: Ling Li

VRCAI '22: Proceedings of the 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry

Article No.: 19, Pages 1 - 8

https://doi.org/10.1145/3574131.3574448

Published: 13 January 2023

Abstract

Observing people in monocular images is a basic task in computer vision. Reconstructing a human body from a monocular image mainly involves recovering its pose and its body shape. Past studies have concentrated on pose estimation and largely neglected body shape; this paper focuses on estimating the body shape of a 3D model. Learning body parameters via instance segmentation requires a large number of labels, while parameters derived from pose estimation depend entirely on the results of keypoint detection, which performs poorly on images with poor viewing angles and low resolution. To address these problems, we propose a method for generating datasets automatically. The generated dataset provides low-resolution images, covering various angles and blurred shapes, together with their labels. On this low-resolution, poorly angled dataset, we propose a generation-assisted deep learning framework. Experiments show that the framework can effectively estimate the body shape parameters of the model from monocular images.
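The abstract does not give the details of the dataset generator, so the following is only a rough sketch of the general idea of automatically producing low-resolution, blurred images with shape labels at varied viewing angles. Everything in it is an assumption made for illustration: the 10-dimensional shape vector, the toy ellipse renderer standing in for a real body model, the downsampling factor, and all function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def render_silhouette(beta, yaw_deg, size=128):
    """Hypothetical placeholder renderer: an ellipse whose width depends on
    the first shape coefficient and the viewing angle. NOT a body model."""
    ys, xs = np.mgrid[0:size, 0:size]
    cy = cx = (size - 1) / 2.0
    half_h = 0.45 * size
    # Wider for larger beta[0]; narrower when viewed from the side.
    base_w = 0.15 * size * (1.0 + 0.2 * np.tanh(beta[0]))
    half_w = base_w * (0.6 + 0.4 * abs(np.cos(np.deg2rad(yaw_deg))))
    inside = ((xs - cx) / half_w) ** 2 + ((ys - cy) / half_h) ** 2 <= 1.0
    return inside.astype(np.float32)

def downsample(img, factor):
    """Average-pool by an integer factor to simulate low resolution."""
    h, w = img.shape
    h2, w2 = h // factor, w // factor
    cropped = img[:h2 * factor, :w2 * factor]
    return cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))

def box_blur(img, k=3):
    """k x k box filter with edge padding to soften the outline."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def generate_dataset(n, num_shape_params=10, factor=4, seed=0):
    """Sample random shape labels and viewing angles, then render degraded images."""
    rng = np.random.default_rng(seed)
    images, labels, angles = [], [], []
    for _ in range(n):
        beta = rng.normal(0.0, 1.0, size=num_shape_params).astype(np.float32)
        yaw = rng.uniform(0.0, 360.0)  # assumed: arbitrary viewing angle
        img = box_blur(downsample(render_silhouette(beta, yaw), factor))
        images.append(img)
        labels.append(beta)
        angles.append(yaw)
    return np.stack(images), np.stack(labels), np.array(angles)

images, labels, angles = generate_dataset(8)
print(images.shape, labels.shape, angles.shape)
```

Because the labels are sampled before rendering, every image comes with its ground-truth shape vector at no annotation cost, which is the property that makes synthetic generation attractive when manual labels are scarce.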

Cited By

Ji X, Zhou L. (2023). Location of acupuncture points based on graph convolution and 3D deep learning in virtual humans. Computer Animation and Virtual Worlds 34:6. Online publication date: 21-Jun-2023.
https://doi.org/10.1002/cav.2159

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Reconstruction
  2. Modeling and simulation
    1. Model development and analysis
      1. Modeling methodologies
2. Theory of computation
  1. Models of computation

Information

Published In

VRCAI '22: Proceedings of the 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry

December 2022

284 pages

ISBN: 9798400700316

DOI: 10.1145/3574131

Editors:
Enhua Wu, SKLCS, Chinese Academy of Sciences / FST, University of Macau / Guangzhou Greater Bay Area Virtual Reality Research Institute, China
Lionel Ming-Shuan Ni, The Hong Kong University of Science and Technology (Guangzhou) & The Hong Kong University of Science and Technology, China
Zhigeng Pan, Nanjing University of Information Science & Technology / Hangzhou Normal University, China
Daniel Thalmann, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Ping Li, The Hong Kong Polytechnic University, Hong Kong, China
Charlie C.L. Wang, The University of Manchester, U.K.
Lei Zhu, The Hong Kong University of Science and Technology (Guangzhou) & The Hong Kong University of Science and Technology, China
Minghao Yang, Institute of Automation, Chinese Academy of Sciences, China

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

VRCAI '22

Sponsor:

SIGGRAPH

VRCAI '22: The 18th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and its Applications in Industry

December 27 - 29, 2022

Guangzhou, China

Acceptance Rates

Overall Acceptance Rate 51 of 107 submissions, 48%
