Garment reconstruction from a single-view image based on pixel-aligned implicit function

Multimedia Tools and Applications

Abstract

3D garment reconstruction has a wide range of applications in apparel design, digital human modeling, and virtual try-on. Reconstructing a 3D shape from a single-view image is a severely under-constrained and challenging problem, because one view does not fully determine the 3D geometry. Recent single-view methods need only single-view images of static or dynamic objects, but they recover the latent multi-view information from statistical, geometric, and physical priors that are tedious to obtain. In this paper, we use a pixel-aligned implicit function, represented by a neural network, to associate 2D image pixels with the corresponding 3D garment information, which allows end-to-end training without any such prior model. Qualitative and quantitative analysis of the experimental results shows that our method reduces the relative error by an average of 2.6% and the chamfer distance by 2.37% compared with the previous method. The experiments show that our model can plausibly reconstruct 3D garment models from single-view images. The method captures the overall geometry of the garment and also recovers the small but important wrinkle details of the fabric. Even with low-resolution input images, the model still produces usable results. In addition, the experiments show that the reconstructed 3D garment models can be used for texture transfer and virtual fitting.
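
To make the two technical ideas named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an orthographic camera, a precomputed image feature map, and a tiny two-layer MLP with random weights; the function names (`sample_bilinear`, `occupancy`, `chamfer_distance`) and all shapes are hypothetical. The chamfer distance follows one common convention (the sum of the two mean squared nearest-neighbor distances); the paper's exact definition is not given in this preview.

```python
import numpy as np


def sample_bilinear(feat, xy):
    """Bilinearly sample a (C, H, W) feature map at N points.

    xy holds normalized image coordinates in [-1, 1]; x indexes columns and
    y indexes rows (an assumed convention). Returns an (N, C) array.
    """
    _, H, W = feat.shape
    # Map normalized coordinates to continuous pixel indices.
    u = np.clip((xy[:, 0] + 1.0) * 0.5 * (W - 1), 0, W - 1)
    v = np.clip((xy[:, 1] + 1.0) * 0.5 * (H - 1), 0, H - 1)
    u0, v0 = np.floor(u).astype(int), np.floor(v).astype(int)
    u1, v1 = np.minimum(u0 + 1, W - 1), np.minimum(v0 + 1, H - 1)
    du, dv = u - u0, v - v0
    # Interpolate along columns first, then along rows.
    top = feat[:, v0, u0] * (1 - du) + feat[:, v0, u1] * du
    bottom = feat[:, v1, u0] * (1 - du) + feat[:, v1, u1] * du
    return (top * (1 - dv) + bottom * dv).T


def occupancy(feat, points, w1, b1, w2, b2):
    """Pixel-aligned implicit query: (pixel feature, depth) -> inside probability.

    Assumes an orthographic projection, so a 3D point (x, y, z) in [-1, 1]^3
    lands on pixel (x, y) and keeps z as its depth.
    """
    xy, z = points[:, :2], points[:, 2:3]
    pixel_feat = sample_bilinear(feat, xy)            # (N, C)
    h = np.concatenate([pixel_feat, z], axis=1)       # (N, C + 1)
    h = np.maximum(h @ w1 + b1, 0.0)                  # hidden layer with ReLU
    logits = (h @ w2 + b2)[:, 0]                      # (N,)
    return 1.0 / (1.0 + np.exp(-logits))              # sigmoid


def chamfer_distance(a, b):
    """Symmetric chamfer distance between point sets a (Na, 3) and b (Nb, 3)."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (Na, Nb)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 16, 16))           # stand-in for encoder output
    pts = rng.uniform(-1.0, 1.0, size=(5, 3))         # query points in the unit cube
    w1, b1 = rng.standard_normal((9, 32)) * 0.1, np.zeros(32)
    w2, b2 = rng.standard_normal((32, 1)) * 0.1, np.zeros(1)
    print(occupancy(feat, pts, w1, b1, w2, b2))       # one probability per point
    print(chamfer_distance(pts, pts + 0.01))
```

In a trained system the feature map would come from an image encoder and the MLP weights from end-to-end training; a surface could then be extracted from the occupancy field, for example at the 0.5 level set. The dense pairwise distance matrix in `chamfer_distance` is only practical for small point sets.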

Data sharing

Data sharing is not applicable to this article, as no new data were created or analyzed in this study.

Funding

This work was supported by the National Natural Science Foundation of China (No. 61976105) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX22_2342).

Author information

Corresponding author

Correspondence to Ruru Pan.

Ethics declarations

Conflict of interest

Wentao He, Ning Zhang, Bingpeng Song, and Ruru Pan declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

He, W., Zhang, N., Song, B. et al. Garment reconstruction from a single-view image based on pixel-aligned implicit function. Multimed Tools Appl 82, 30247–30265 (2023). https://doi.org/10.1007/s11042-023-14924-x

  • DOI: https://doi.org/10.1007/s11042-023-14924-x
