
GAN-Powered Model & Landmark-Free Reconstruction: A Versatile Approach for High-Quality 3D Facial and Object Recovery from Single Images

  • Conference paper
  • First Online:
Deep Learning Theory and Applications (DeLTA 2023)

Abstract

In recent years, 3D facial reconstruction from single images has garnered significant interest. Most approaches rely on fitting a 3D Morphable Model (3DMM) to recover the 3D face shape, while the adoption of Generative Adversarial Networks (GANs) has been gaining momentum as a way to improve the texture of reconstructed faces. In this paper, we propose a fundamentally different approach to reconstructing the 3D head shape from a single image by harnessing the power of GANs: our method predicts three maps of normal vectors corresponding to the head’s frontal, left, and right poses. We thus present a model-free method that requires no prior knowledge of the geometry of the object to be reconstructed.

The key advantage of our proposed approach is a substantial improvement in reconstruction quality compared to existing methods, particularly for facial regions that are self-occluded in the input image. Our method is not limited to 3D face reconstruction; it is generic and applicable to many kinds of 3D objects. To illustrate this versatility, we demonstrate its efficacy in reconstructing the entire human body.

By delivering a model-free method capable of generating high-quality 3D reconstructions, this paper not only advances the field of 3D facial reconstruction but also provides a foundation for future research and applications spanning multiple object types. The implications of this work have the potential to extend far beyond facial reconstruction, paving the way for innovative solutions and discoveries in various domains.
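To make the method described above concrete, the following is a minimal, hypothetical PyTorch sketch of that kind of pipeline: a generator maps a single RGB image to three pixel-wise normal maps (for the frontal, left, and right poses), and an adversarial critic scores image/normal-map pairs. All class names, layer sizes, and tensor shapes here are illustrative assumptions and do not reproduce the authors' actual architecture or training procedure.

    # Hypothetical sketch: a GAN-style generator predicting three normal maps
    # (frontal, left, right) from one image. Architecture details are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NormalMapGenerator(nn.Module):
        """Encoder-decoder mapping one RGB image to 3 poses x 3-channel normal maps."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            )
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 9, 4, stride=2, padding=1),  # 3 poses x (nx, ny, nz)
            )

        def forward(self, img):
            feat = self.decoder(self.encoder(img))               # (B, 9, H, W)
            normals = feat.view(img.size(0), 3, 3, *feat.shape[-2:])
            return F.normalize(normals, dim=2)                   # unit-length normals per pixel

    class Discriminator(nn.Module):
        """Adversarial critic on concatenated (image, predicted normal maps) pairs."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3 + 9, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                nn.Conv2d(64, 1, 4, stride=2, padding=1),        # patch-wise real/fake scores
            )

        def forward(self, img, normals):
            flat = normals.flatten(1, 2)                         # (B, 9, H, W)
            return self.net(torch.cat([img, flat], dim=1))

    if __name__ == "__main__":
        gen = NormalMapGenerator()
        img = torch.randn(1, 3, 128, 128)                        # a single input image
        maps = gen(img)                                          # (1, 3 poses, 3, 128, 128)
        print(maps.shape)

A final step, not shown here, would integrate the predicted normal fields into a single 3D surface, for example with a standard normal-integration technique.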

Acknowledgement

This work was partially supported by a grant from the BMWi ZIM program, no. KK5007201LB0.

Author information

Corresponding authors

Correspondence to Patrik Huber, Muhammad Awais, Matthias Rätsch or Josef Kittler.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Danner, M., Huber, P., Awais, M., Rätsch, M., Kittler, J. (2023). GAN-Powered Model & Landmark-Free Reconstruction: A Versatile Approach for High-Quality 3D Facial and Object Recovery from Single Images. In: Conte, D., Fred, A., Gusikhin, O., Sansone, C. (eds) Deep Learning Theory and Applications. DeLTA 2023. Communications in Computer and Information Science, vol 1875. Springer, Cham. https://doi.org/10.1007/978-3-031-39059-3_27

  • DOI: https://doi.org/10.1007/978-3-031-39059-3_27

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39058-6

  • Online ISBN: 978-3-031-39059-3

  • eBook Packages: Computer Science, Computer Science (R0)
