Rayleigh EigenDirections (REDs): Nonlinear GAN Latent Space Traversals for Multidimensional Features

Balakrishnan, Guha; Gadde, Raghudeep; Martinez, Aleix; Perona, Pietro

doi:10.1007/978-3-031-19790-1_31

Conference paper
First Online: 24 October 2022

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13677)

Abstract

We present a method for finding paths in a deep generative model’s latent space that can maximally vary one set of image features while holding others constant. Crucially, unlike past traversal approaches, ours can manipulate arbitrary multidimensional features of an image such as facial identity and pixels within a specified region. Our method is principled and conceptually simple: optimal traversal directions are chosen by maximizing differential changes to one feature set such that changes to another set are negligible. We show that this problem is nearly equivalent to one of Rayleigh quotient maximization, and provide a closed-form solution to it based on solving a generalized eigenvalue equation. We use repeated computations of the corresponding optimal directions, which we call Rayleigh EigenDirections (REDs), to generate appropriately curved paths in latent space. We empirically evaluate our method using StyleGAN2 and BigGAN on the following image domains: faces, living rooms and ImageNet. We show that our method is capable of controlling various multidimensional features: face identity, geometric and semantic attributes, spatial frequency bands, pixels within a region, and the appearance and position of an object. Our work suggests that a wealth of opportunities lies in the local analysis of the geometry and semantics of latent spaces.
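
The core step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the Jacobians of the two feature sets with respect to the latent code are available as dense matrices (J_f for the features to vary, J_g for the features to hold fixed), and that a small regularizer eps is added so the denominator matrix is positive definite. The function names rayleigh_direction and traverse, the callables jac_f and jac_g, and the parameters eps, steps and step_size are all hypothetical. Since the abstract states that the problem is only nearly equivalent to Rayleigh quotient maximization, the paper's exact objective may differ from the one used here.

```python
import numpy as np

def rayleigh_direction(J_f, J_g, eps=1e-6):
    """Unit direction d maximizing (d^T A d) / (d^T B d).

    A = J_f^T J_f measures the local change in the features to vary.
    B = J_g^T J_g + eps * I measures the local change in the features
    to hold fixed; eps is an assumed regularizer, not from the paper.
    """
    n = J_f.shape[1]
    A = J_f.T @ J_f
    B = J_g.T @ J_g + eps * np.eye(n)
    # Whiten with the Cholesky factor B = L L^T. This turns the generalized
    # problem A v = lam B v into a standard symmetric eigenproblem for
    # C = L^-1 A L^-T, whose eigenvectors y satisfy v = L^-T y.
    L = np.linalg.cholesky(B)
    L_inv = np.linalg.inv(L)
    C = L_inv @ A @ L_inv.T
    eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
    d = L_inv.T @ eigvecs[:, -1]          # map the top eigenvector back
    return d / np.linalg.norm(d), eigvals[-1]

def traverse(z0, jac_f, jac_g, steps=10, step_size=0.1):
    """Follow repeatedly recomputed directions to trace a curved path."""
    z = np.asarray(z0, dtype=float)
    path = [z.copy()]
    prev = None
    for _ in range(steps):
        d, _ = rayleigh_direction(jac_f(z), jac_g(z))
        # Eigenvectors have arbitrary sign; keep a consistent heading.
        if prev is not None and d @ prev < 0:
            d = -d
        z = z + step_size * d
        path.append(z.copy())
        prev = d
    return np.stack(path)
```

As a toy check, with J_f = diag(1, 2, 3) and J_g selecting only the third latent coordinate, the returned direction lies along the second axis (up to sign): it is the direction of largest change in the first feature set among those that leave the second set unchanged. Recomputing the direction at every step is what lets the path curve when the Jacobians depend on the latent code.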

Notes

1. There is no physical ‘identity’ ground truth behind a GAN-generated portrait. However, human observers or face recognition algorithms can respond to the question “Is this the same person?” and can produce consistent judgments. Therefore ‘identity’ here denotes ‘perceptual identity’.
2. https://github.com/zllrunning/face-parsing.PyTorch.

Author information

Authors and Affiliations

Rice University, Houston, TX, 77005, USA
Guha Balakrishnan
Amazon, Seattle, WA, 98109, USA
Guha Balakrishnan, Raghudeep Gadde, Aleix Martinez & Pietro Perona

Corresponding author

Correspondence to Guha Balakrishnan.

Editor information

Editors and Affiliations

Tel Aviv University, Tel Aviv, Israel
Shai Avidan
University College London, London, UK
Gabriel Brostow
Google AI, Accra, Ghana
Moustapha Cissé
University of Catania, Catania, Italy
Giovanni Maria Farinella
Facebook (United States), Menlo Park, CA, USA
Tal Hassner

Cite this paper

Balakrishnan, G., Gadde, R., Martinez, A., Perona, P. (2022). Rayleigh EigenDirections (REDs): Nonlinear GAN Latent Space Traversals for Multidimensional Features. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13677. Springer, Cham. https://doi.org/10.1007/978-3-031-19790-1_31

DOI: https://doi.org/10.1007/978-3-031-19790-1_31
Published: 24 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19789-5
Online ISBN: 978-3-031-19790-1
eBook Packages: Computer Science, Computer Science (R0)