Exp-GAN: 3D-Aware Facial Image Generation with Expression Control

Lee, Yeonkyeong; Choi, Taeho; Go, Hyunsung; Lee, Hyunjoon; Cho, Sunghyun; Kim, Junho

doi:10.1007/978-3-031-26293-7_10

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13847)

Included in the following conference series: Asian Conference on Computer Vision

Abstract

This paper introduces Exp-GAN, a 3D-aware facial image generator with explicit control of facial expressions. Unlike previous 3D-aware GANs, Exp-GAN supports fine-grained control over facial shapes and expressions disentangled from poses. To this end, we propose a novel hybrid approach that adopts a 3D morphable model (3DMM) with neural textures for the facial region and a neural radiance field (NeRF) for non-facial regions with multi-view consistency. The 3DMM allows fine-grained control over facial expressions, whereas the NeRF contains volumetric features for the non-facial regions. The two features, generated separately, are combined seamlessly with our depth-based integration method that integrates the two complementary features through volume rendering. We also propose a training scheme that encourages generated images to reflect control over shapes and expressions faithfully. Experimental results show that the proposed approach successfully synthesizes realistic view-consistent face images with fine-grained controls. Code is available at https://github.com/kakaobrain/expgan.
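
The abstract does not spell out how the depth-based integration works, so the following is only a minimal sketch of one way a mesh-surface feature could be merged with NeRF samples along a single ray, and is not the authors' implementation. It uses standard volume-rendering alpha compositing. The function name `composite_ray`, its parameters, and the choice to treat the 3DMM surface as a single fully opaque sample placed at its rasterized depth are all assumptions made for illustration.

```python
import numpy as np


def composite_ray(depths, sigmas, feats, face_depth=None, face_feat=None):
    """Alpha-composite features along one ray (hypothetical helper).

    depths: (N,) sorted sample depths of the volumetric (NeRF) branch
    sigmas: (N,) non-negative densities at those samples
    feats:  (N, C) feature vectors at those samples
    face_depth, face_feat: optional depth and (C,) feature of the mesh
        surface hit by this ray; assumed fully opaque in this sketch.
    """
    depths = np.asarray(depths, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    feats = np.asarray(feats, dtype=float)

    # Standard volume rendering: opacity of each sample from density
    # times the distance to the next sample (last interval is "infinite").
    deltas = np.append(np.diff(depths), 1e10)
    alphas = 1.0 - np.exp(-sigmas * deltas)

    if face_depth is not None:
        # Assumption: insert the surface sample at its depth with alpha = 1,
        # so it hides everything behind it but can be occluded from the front.
        idx = np.searchsorted(depths, face_depth)
        alphas = np.insert(alphas, idx, 1.0)
        feats = np.insert(feats, idx, np.asarray(face_feat, dtype=float), axis=0)

    # Transmittance: fraction of light that reaches each sample unblocked.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return weights @ feats, weights


# Usage: an empty volume with a face surface between the 2nd and 3rd samples.
feature, weights = composite_ray(
    depths=[1.0, 2.0, 3.0],
    sigmas=[0.0, 0.0, 0.0],
    feats=np.eye(3),
    face_depth=2.5,
    face_feat=[9.0, 9.0, 9.0],
)
```

In this toy setup the surface sample receives all of the weight when the volume in front of it is empty, while a dense volumetric sample in front of it (for example, hair crossing the face) would take the weight instead. That occlusion ordering is what a depth-aware merge is meant to provide.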

This work was done when the first author was with Kookmin University.

Acknowledgements

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2022R1F1A1074628, 2022R1A5A7000765) and the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01826, Problem-Based Learning Program for Researchers to Proactively Solve Practical AI Problems (Kookmin University) and No. 2019-0-01906, Artificial Intelligence Graduate School Program (POSTECH)).

Author information

Authors and Affiliations

Kakao Brain, Seongnam, South Korea
Yeonkyeong Lee, Taeho Choi & Hyunjoon Lee
Kookmin University, Seoul, South Korea
Hyunsung Go & Junho Kim
POSTECH, Pohang, South Korea
Sunghyun Cho

Corresponding author

Correspondence to Junho Kim.

Editor information

Editors and Affiliations

University of Wollongong, Wollongong, NSW, Australia
Lei Wang
University of Bonn, Bonn, Germany
Juergen Gall
University of Adelaide, Adelaide, SA, Australia
Tat-Jun Chin
National Institute of Informatics, Tokyo, Japan
Imari Sato
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa

Electronic supplementary material

Supplementary material 1 (PDF, 42455 KB)

About this paper

Cite this paper

Lee, Y., Choi, T., Go, H., Lee, H., Cho, S., Kim, J. (2023). Exp-GAN: 3D-Aware Facial Image Generation with Expression Control. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13847. Springer, Cham. https://doi.org/10.1007/978-3-031-26293-7_10

DOI: https://doi.org/10.1007/978-3-031-26293-7_10
Published: 11 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26292-0
Online ISBN: 978-3-031-26293-7
eBook Packages: Computer Science, Computer Science (R0)
