Skip to main content

Unsupervised Detailed Human Shape Estimation from Multi-view Color Images

  • Conference paper
  • First Online:
Image and Graphics (ICIG 2021)

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 12889))

Included in the following conference series:

  • 1867 Accesses

Abstract

This paper presents a novel framework to estimate detailed human body shape from color images in an unsupervised manner. It is a challenging task due to factors such as variations in human shapes, occlusion, and cloth details. The existing methods are mainly supervised and require a large number of ground truth real training data which is usually hard to obtain. To solve this problem, we propose an unsupervised detailed human shape estimation method from multi-view color images. Specifically, we first predict the depth map for the source view through robust photometric consistency with different views. Then, we predict the initial SMPL model from the color image and refine it by an iterative error feedback regressor based on point clouds of the predicted depth map. Finally, the refined SMPL model is deformed to fit the details (i.e., clothes and faces) on the point clouds to recover the detailed human shapes which are represented by adding a set of offsets to the SMPL model. The experimental results on different dataset demonstrate that our method outperforms the state-of-the-art methods and achieves higher reconstruction accuracy.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Alldieck, T., Magnor, M., Bhatnagar, B.L., Theobalt, C., Pons-Moll, G.: Learning to reconstruct people in clothing from a single RGB camera. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1175–1186 (2019)

    Google Scholar 

  2. Alldieck, T., Magnor, M., Xu, W., Theobalt, C., Pons-Moll, G.: Video based reconstruction of 3D people models. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8387–8397 (2018)

    Google Scholar 

  3. Bogo, F., Kanazawa, A., Lassner, C., Gehler, P., Romero, J., Black, M.J.: Keep It SMPL: automatic estimation of 3D human pose and shape from a single image. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 561–578. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_34

    Chapter  Google Scholar 

  4. Carreira, J., Agrawal, P., Fragkiadaki, K., Malik, J.: Human pose estimation with iterative error feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4733–4742 (2016)

    Google Scholar 

  5. Fish Tung, H.Y., Harley, A.W., Seto, W., Fragkiadaki, K.: Adversarial inverse graphics networks: learning 2D-to-3D lifting and image-to-image translation from unpaired supervision. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4354–4362 (2017)

    Google Scholar 

  6. Groueix, T., Fisher, M., Kim, V.G., Russell, B.C., Aubry, M.: 3D-coded: 3D correspondences by deep deformation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 230–246 (2018)

    Google Scholar 

  7. He, T., Collomosse, J.P., Jin, H., Soatto, S.: Geo-pifu: geometry and pixel aligned implicit functions for single-view human reconstruction. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H. (eds.) Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6–12, 2020, virtual (2020)

    Google Scholar 

  8. Jiang, B., Zhang, J., Hong, Y., Luo, J., Liu, L., Bao, H.: BCNet: learning body and cloth shape from a single image. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12365, pp. 18–35. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58565-5_2

    Chapter  Google Scholar 

  9. Joo, H., et al.: Panoptic studio: a massively multiview system for social interaction capture. IEEE Transactions on Pattern Analysis and Machine Intelligence (2017)

    Google Scholar 

  10. Kanazawa, A., Black, M.J., Jacobs, D.W., Malik, J.: End-to-end recovery of human shape and pose. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7122–7131 (2018)

    Google Scholar 

  11. Khot, T., Agrawal, S., Tulsiani, S., Mertz, C., Lucey, S., Hebert, M.: Learning unsupervised multi-view stereopsis via robust photometric consistency. arXiv preprint arXiv:1905.02706 (2019)

  12. Kolotouros, N., Pavlakos, G., Black, M.J., Daniilidis, K.: Learning to reconstruct 3D human pose and shape via model-fitting in the loop. In: 2019 IEEE/CVF International Conference on Computer Vision, ICCV 2019, Seoul, Korea (South), October 27 - November 2, 2019, pp. 2252–2261 (2019)

    Google Scholar 

  13. Kolotouros, N., Pavlakos, G., Daniilidis, K.: Convolutional mesh regression for single-image human shape reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4501–4510 (2019)

    Google Scholar 

  14. Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., Black, M.J.: Smpl: a skinned multi-person linear model. ACM Trans. Graph. (TOG) 34(6), 1–16 (2015)

    Article  Google Scholar 

  15. Lorensen, W.E., Cline, H.E.: Marching cubes: a high resolution 3d surface construction algorithm. In: Stone, M.C. (ed.) Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH 1987, Anaheim, California, USA, July 27–31, 1987, pp. 163–169. ACM (1987). https://doi.org/10.1145/37401.37422

  16. Ma, Q., et al.: Learning to dress 3D people in generative clothing. In: Computer Vision and Pattern Recognition (CVPR) (2020)

    Google Scholar 

  17. Mahjourian, R., Wicke, M., Angelova, A.: Unsupervised learning of depth and ego-motion from monocular video using 3D geometric constraints. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5667–5675 (2018)

    Google Scholar 

  18. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: Pointnet++: Deep hierarchical feature learning on point sets in a metric space. arXiv preprint arXiv:1706.02413 (2017)

  19. Saito, S., Huang, Z., Natsume, R., Morishima, S., Kanazawa, A., Li, H.: Pifu: pixel-aligned implicit function for high-resolution clothed human digitization. In: The IEEE International Conference on Computer Vision (ICCV), October 2019

    Google Scholar 

  20. Saito, S., Simon, T., Saragih, J.M., Joo, H.: Pifuhd: multi-level pixel-aligned implicit function for high-resolution 3D human digitization. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, Seattle, WA, USA, June 13–19, 2020, pp. 81–90. IEEE (2020)

    Google Scholar 

  21. Sorkine, O.: Differential representations for mesh processing. Comput. Graph. Forum 25(4), 789–807 (2006). https://doi.org/10.1111/j.1467-8659.2006.00999.x

    Article  Google Scholar 

  22. Tang, S., Tan, F., Cheng, K., Li, Z., Zhu, S., Tan, P.: A neural network for detailed human depth estimation from a single image. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019

    Google Scholar 

  23. Varol, G., et al.: Learning from synthetic humans. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 109–117 (2017)

    Google Scholar 

  24. Wang, K., Zhang, G., Yang, J., Bao, H.: Dynamic human body reconstruction and motion tracking with low-cost depth cameras. Vis. Comput. 37(3), 603–618 (2021)

    Article  Google Scholar 

  25. Wang, Z., Bovik, A., Sheikh, H., Simoncelli, E.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004). https://doi.org/10.1109/TIP.2003.819861

    Article  Google Scholar 

  26. Yao, Y., Luo, Z., Li, S., Fang, T., Quan, L.: Mvsnet: depth inference for unstructured multi-view stereo. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 767–783 (2018)

    Google Scholar 

  27. Zhang, C., Pujades, S., Black, M.J., Pons-Moll, G.: Detailed, accurate, human shape estimation from clothed 3D scan sequences. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4191–4200 (2017)

    Google Scholar 

  28. Zhu, H., Zuo, X., Wang, S., Cao, X., Yang, R.: Detailed human shape estimation from a single image by hierarchical mesh deformation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

    Google Scholar 

Download references

Acknowledgements

This work was supported in part by the Fundamental Research Funds for the Central Universities (NJ2020023), in part by the Open Project Program of State Key Laboratory of Virtual Reality Technology and Systems of Beihang University (No.VRLAB2021C03), and in part by the Open Project Program of the State Key Lab of CAD&CG of Zhejiang University (Grant No.A2106).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Kangkan Wang .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Zheng, H., Wang, K., Li, W., Yang, J. (2021). Unsupervised Detailed Human Shape Estimation from Multi-view Color Images. In: Peng, Y., Hu, SM., Gabbouj, M., Zhou, K., Elad, M., Xu, K. (eds) Image and Graphics. ICIG 2021. Lecture Notes in Computer Science(), vol 12889. Springer, Cham. https://doi.org/10.1007/978-3-030-87358-5_38

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-87358-5_38

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87357-8

  • Online ISBN: 978-3-030-87358-5

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics