DOI: 10.1145/3647649.3647689

Animatable 3D Facial Detail Reconstruction from In-the-wild Images


ABSTRACT

With the rise of the "metaverse" concept, digital human technology has attracted widespread attention, and fast, high-fidelity 3D face modeling has become a research focus in the digital human field. To address the inability of 3D Morphable Models (3DMM) to reconstruct facial details, this paper proposes an animatable facial detail generation algorithm based on a generative adversarial loss, which recovers fine facial relief on top of the coarse face obtained from 3D facial reconstruction, achieving highly realistic reconstruction of digital human faces. First, the method uses displacement mapping to recover facial detail: an autoencoder network predicts a displacement map from the input image, which offsets the vertices of the 3D model to express finer relief. Second, to capture mid-to-high-frequency facial details effectively, a generative adversarial loss is introduced to model the high-frequency differences between images; it is jointly applied with photometric and other content losses to keep the generated details accurate. Finally, to decouple static from dynamic facial details, the 3DMM expression parameters are embedded into the displacement-map generator, and an identity-averaging strategy constrains the encoder network to model only the static details of the face, while the dynamic details are supplied by the 3DMM parameters, enabling expression-driven generation of dynamic facial details. Experiments show that the proposed algorithm is competitive with state-of-the-art facial detail generation methods in reconstruction quality and accuracy. Expression transfer experiments further verify that the model constructs facial details dynamically, and ablation studies confirm the importance of the generative adversarial loss.
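To make the displacement-mapping step concrete, here is a minimal PyTorch sketch (not the authors' code) of how a scalar displacement map predicted by an autoencoder can offset the vertices of a base 3DMM mesh along their normals. The tensor shapes, the per-vertex `uv` coordinates, and the function names are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def vertex_normals(verts: torch.Tensor, faces: torch.Tensor) -> torch.Tensor:
    # Area-weighted per-vertex normals: verts (V, 3) float, faces (F, 3) long.
    v0, v1, v2 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    face_n = torch.cross(v1 - v0, v2 - v0, dim=1)   # (F, 3), length ~ face area
    normals = torch.zeros_like(verts)
    for k in range(3):                              # accumulate onto each corner vertex
        normals.index_add_(0, faces[:, k], face_n)
    return F.normalize(normals, dim=1)

def displace_vertices(verts, faces, uv, disp_map):
    # disp_map: (1, 1, H, W) scalar displacement map from the autoencoder.
    # uv: (V, 2) per-vertex texture coordinates in [0, 1] (the v-axis convention
    # may need flipping depending on how the map was rasterized).
    grid = uv.view(1, -1, 1, 2) * 2.0 - 1.0         # grid_sample expects [-1, 1]
    d = F.grid_sample(disp_map, grid, align_corners=False).view(-1, 1)  # (V, 1)
    return verts + d * vertex_normals(verts, faces)  # offset along the normal
```

In the full pipeline, the detailed mesh would then be rendered differentiably so that image-space losses such as those sketched next can be applied.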
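The joint objective can likewise be sketched as a photometric reconstruction term plus a GAN term on the rendered result. The loss weights, the non-saturating BCE formulation, and the `disc` network below are assumptions for illustration, not the paper's exact choices.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def generator_loss(disc, rendered, target, w_photo=1.0, w_adv=0.01):
    # Photometric L1 between the detailed render and the input image,
    # plus a non-saturating adversarial term that tries to fool `disc`.
    l_photo = (rendered - target).abs().mean()
    logits_fake = disc(rendered)
    l_adv = bce(logits_fake, torch.ones_like(logits_fake))
    return w_photo * l_photo + w_adv * l_adv

def discriminator_loss(disc, rendered, target):
    # Standard real/fake classification; detach so the generator gets no gradient.
    logits_real = disc(target)
    logits_fake = disc(rendered.detach())
    return 0.5 * (bce(logits_real, torch.ones_like(logits_real))
                  + bce(logits_fake, torch.zeros_like(logits_fake)))
```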
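Finally, the static/dynamic decoupling can be illustrated by conditioning the displacement generator on both a per-image static detail code and the 3DMM expression parameters, while averaging static codes across images of the same identity. The dimensions (e.g. a FLAME-style 50-expression + 3-jaw-pose vector) and the toy one-layer decoder are assumptions.

```python
import torch
import torch.nn as nn

class DetailGenerator(nn.Module):
    # Toy decoder: [static detail code | 3DMM expression params] -> displacement map.
    def __init__(self, code_dim=128, exp_dim=53, map_size=64):
        super().__init__()
        self.map_size = map_size
        self.fc = nn.Linear(code_dim + exp_dim, map_size * map_size)

    def forward(self, static_code, exp_params):
        z = torch.cat([static_code, exp_params], dim=-1)
        return self.fc(z).view(-1, 1, self.map_size, self.map_size)

def identity_averaged_codes(codes: torch.Tensor) -> torch.Tensor:
    # codes: (K, code_dim) static detail codes from K images of one person.
    # Averaging pushes the encoder to keep only person-specific (static) detail;
    # anything expression-dependent must come through exp_params instead.
    return codes.mean(dim=0, keepdim=True).expand_as(codes)
```

At animation time, new expression parameters can be fed to the generator together with the identity's averaged static code, yielding expression-driven dynamic details as described in the abstract.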


Published in

ICIGP '24: Proceedings of the 2024 7th International Conference on Image and Graphics Processing
January 2024, 480 pages
ISBN: 9798400716720
DOI: 10.1145/3647649

Copyright © 2024 ACM


Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 3 May 2024


Qualifiers

• research-article
• Research
• Refereed limited