3D human body reconstruction based on SMPL model

  • Original article
  • The Visual Computer

Abstract

Recovering 3D human pose and body shape from a monocular image is a challenging task in computer vision. In this paper, we present an optimization-based algorithm and a framework for reconstructing the 3D human body from a single monocular image. All estimation tasks build on the classic parametric 3D body model SMPL. First, we propose a new combined objective function over the SMPL parameters with four loss terms, on 2D joints, 3D joints, facial landmarks, and a pose prior, which markedly increases the reliability of the estimation results. Furthermore, we initialize the parameters with the estimates of an end-to-end regression network, which is shown to speed up the optimization process. Experiments on the benchmark datasets Human3.6M and LSP and on an in-the-wild dataset demonstrate that our model estimates the 3D human body accurately and robustly and outperforms popular competing algorithms in precision and robustness.
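To make the idea of a combined objective concrete, the following is a minimal sketch of a weighted sum of four loss terms of the kind the abstract lists. It is not the paper's formulation: the squared-error form of each term, the simple L2 pose prior, the pinhole camera, the weight names, and every function name here are assumptions chosen for illustration. In a real pipeline the 3D joints and facial landmarks would be produced by the SMPL model from its pose and shape parameters, and the optimizer would start from the regression network's estimate; here they are passed in directly.

```python
from typing import Dict, Sequence, Tuple

Point2 = Tuple[float, float]
Point3 = Tuple[float, float, float]


def project(point_3d: Point3, focal: float = 1000.0,
            center: Point2 = (0.0, 0.0)) -> Point2:
    """Pinhole projection of a camera-space point to pixel coordinates."""
    x, y, z = point_3d
    return (focal * x / z + center[0], focal * y / z + center[1])


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def combined_objective(joints_3d: Sequence[Point3],
                       face_3d: Sequence[Point3],
                       pose: Sequence[float],
                       target_2d: Sequence[Point2],
                       target_3d: Sequence[Point3],
                       target_face_2d: Sequence[Point2],
                       weights: Dict[str, float]) -> float:
    """Weighted sum of four illustrative loss terms (all forms are assumed)."""
    # 2D joint term: reprojection error against detected 2D keypoints.
    e_j2d = sum(sq_dist(project(p), t) for p, t in zip(joints_3d, target_2d))
    # 3D joint term: distance to reference 3D joints.
    e_j3d = sum(sq_dist(p, t) for p, t in zip(joints_3d, target_3d))
    # Facial landmark term: reprojection error against detected face landmarks.
    e_face = sum(sq_dist(project(p), t)
                 for p, t in zip(face_3d, target_face_2d))
    # Pose prior term: a plain L2 penalty standing in for a learned prior.
    e_prior = sum(a * a for a in pose)
    return (weights["j2d"] * e_j2d + weights["j3d"] * e_j3d
            + weights["face"] * e_face + weights["prior"] * e_prior)
```

An optimizer would minimize this value with respect to the model parameters. Starting from a good initial estimate, rather than from a neutral pose, typically means fewer iterations are needed to converge.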

Funding

This research was funded by the National Natural Science Foundation of China (62173083), the Fundamental Research Funds for the Central Universities (N2104027), the Innovation Fund of Chinese Universities Industry-University-Research (2020HYA06003), and the Guangdong Basic and Applied Basic Research Foundation (2021B1515120064).

Author information

Corresponding author

Correspondence to Tong Jia.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests related to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, D., Song, Y., Liang, F. et al. 3D human body reconstruction based on SMPL model. Vis Comput 39, 1893–1906 (2023). https://doi.org/10.1007/s00371-022-02453-x

  • DOI: https://doi.org/10.1007/s00371-022-02453-x
