InvGAN: Invertible GANs

Conference paper

Pattern Recognition (DAGM GCPR 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13485)

Abstract

Generation of photo-realistic images, semantic editing, and representation learning are only a few of the many applications of high-resolution generative models. Recent progress in GANs has established them as an excellent choice for such tasks. However, since they do not provide an inference model, downstream tasks such as classification cannot easily be applied to real images using the GAN latent space. Despite numerous efforts to train an inference model or to design an iterative method that inverts a pre-trained generator, previous methods are dataset-specific (e.g., human face images) and architecture-specific (e.g., StyleGAN), and are nontrivial to extend to novel datasets or architectures. We propose a general framework that is agnostic to both architecture and dataset. Our key insight is that, by training the inference and the generative model together, we allow them to adapt to each other and to converge to a better-quality model. Our InvGAN, short for Invertible GAN, successfully embeds real images in the latent space of a high-quality generative model. This allows us to perform image inpainting, merging, interpolation, and online data augmentation. We demonstrate this with extensive qualitative and quantitative experiments.
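
The abstract's key idea, jointly training an inference (encoder) network with the generator so that the two adapt to each other, can be sketched as a short training loop. The PyTorch sketch below is an illustrative approximation rather than the paper's exact objective: the toy network definitions, the particular cycle losses, and their (unit) weighting are assumptions for illustration; the precise InvGAN losses and architectures are given in the full text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM = 64
IMG_DIM = 784  # flattened 28x28 images, purely for illustration

# Placeholder networks; InvGAN is architecture-agnostic, so any
# generator/encoder/discriminator trio could be substituted here.
G = nn.Sequential(nn.Linear(LATENT_DIM, IMG_DIM), nn.Tanh())  # z -> image
E = nn.Sequential(nn.Linear(IMG_DIM, LATENT_DIM))             # image -> z
D = nn.Sequential(nn.Linear(IMG_DIM, 1))                      # real/fake logit

# Generator and encoder share one optimizer: they are trained together,
# which is the "adapt to each other" insight from the abstract.
opt_ge = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):
    n = real.size(0)
    z = torch.randn(n, LATENT_DIM)
    fake = G(z)

    # Discriminator step: distinguish real images from generated ones.
    opt_d.zero_grad()
    d_loss = (bce(D(real), torch.ones(n, 1))
              + bce(D(fake.detach()), torch.zeros(n, 1)))
    d_loss.backward()
    opt_d.step()

    # Joint generator/encoder step: adversarial loss plus two assumed
    # cycle losses so that E approximately inverts G (latent cycle) and
    # G(E(x)) reconstructs real images (image cycle).
    opt_ge.zero_grad()
    ge_loss = bce(D(G(z)), torch.ones(n, 1))
    ge_loss = ge_loss + F.mse_loss(E(G(z)), z)        # latent reconstruction
    ge_loss = ge_loss + F.mse_loss(G(E(real)), real)  # image reconstruction
    ge_loss.backward()
    opt_ge.step()
    return d_loss.item(), ge_loss.item()

# Example usage with random data scaled to [-1, 1]:
# d, ge = train_step(torch.rand(16, IMG_DIM) * 2 - 1)
```

Once such an encoder is trained alongside the generator, a real image x can be embedded as z = E(x) and then edited, interpolated, or inpainted in latent space before decoding with G, which is what enables the downstream applications the abstract lists.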


Notes

  1. Except for BiGAN [14] and ALI [16]. We discuss the differences in Sect. 2.

References

  1. Seamless color mapping for 3D reconstruction with consumer-grade scanning devices

  2. Abdal, R., Qin, Y., Wonka, P.: Image2StyleGAN: how to embed images into the StyleGAN latent space? In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4432–4441 (2019)

  3. Abdal, R., Qin, Y., Wonka, P.: Image2StyleGAN: how to embed images into the StyleGAN latent space? arXiv:1904.03189 (2019)

  4. Abdal, R., Qin, Y., Wonka, P.: Image2StyleGAN++: how to edit the embedded images? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2020)

  5. Alaluf, Y., Patashnik, O., Cohen-Or, D.: ReStyle: a residual-based StyleGAN encoder via iterative refinement (2021)

  6. Alaluf, Y., Tov, O., Mokady, R., Gal, R., Bermano, A.H.: HyperStyle: StyleGAN inversion with hypernetworks for real image editing (2021). arXiv:2111.15666, https://doi.org/10.48550/ARXIV.2111.15666

  7. Balakrishnan, G., Xiong, Y., Xia, W., Perona, P.: Towards causal benchmarking of bias in face analysis algorithms. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12363, pp. 547–563. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58523-5_32

  8. Bau, D., Strobelt, H., Peebles, W., Zhou, B., Zhu, J.Y., Torralba, A., et al.: Semantic photo manipulation with a generative image prior. arXiv preprint arXiv:2005.07727 (2020)

  9. Bousquet, O., Gelly, S., Tolstikhin, I., Simon-Gabriel, C.J., Schoelkopf, B.: From optimal transport to generative modeling: the VEGAN cookbook. arXiv preprint arXiv:1705.07642 (2017)

  10. Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. arXiv:1809.11096 (2018)

  11. Chen, M., Radford, A., Child, R., Wu, J., Jun, H., Luan, D., Sutskever, I.: Generative pretraining from pixels. In: Daumé III, H., Singh, A. (eds.) Proceedings of the 37th International Conference on Machine Learning, Proceedings of Machine Learning Research, 13–18 Jul 2020, vol. 119, pp. 1691–1703. PMLR (2020). https://proceedings.mlr.press/v119/chen20s.html

  12. Cheng, Y., Gan, Z., Li, Y., Liu, J., Gao, J.: Sequential attention GAN for interactive image editing. arXiv preprint arXiv:1812.08352 (2020)

  13. Child, R.: Very deep VAEs generalize autoregressive models and can outperform them on images. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=RLRXCV6DbEJ

  14. Donahue, J., Krähenbühl, P., Darrell, T.: Adversarial feature learning. arXiv preprint arXiv:1605.09782 (2016)

  15. Donahue, J., Simonyan, K.: Large scale adversarial representation learning. arXiv:1907.02544 (2019)

  16. Dumoulin, V., et al.: Adversarially learned inference. arXiv preprint arXiv:1606.00704 (2016)

  17. Ghosh, P., Sajjadi, M.S.M., Vergari, A., Black, M.J., Schölkopf, B.: From variational to deterministic autoencoders. In: 8th International Conference on Learning Representations (ICLR) (2020). https://openreview.net/forum?id=S1g7tpEYDS

  18. Ghosh, P., Gupta, P.S., Uziel, R., Ranjan, A., Black, M.J., Bolkart, T.: GIF: generative interpretable faces. In: International Conference on 3D Vision (3DV) (2020). http://gif.is.tue.mpg.de/

  19. Ghosh, P., Losalka, A., Black, M.J.: Resisting adversarial attacks using Gaussian mixture variational autoencoders. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 541–548 (2019). https://doi.org/10.1609/aaai.v33i01.3301541. https://ojs.aaai.org/index.php/AAAI/article/view/3828

  20. Guan, S., Tai, Y., Ni, B., Zhu, F., Huang, F., Yang, X.: Collaborative learning for faster StyleGAN embedding. arXiv:2007.01758 (2020)

  21. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. arXiv:1603.08155 (2016)

  22. Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196 (2017)

  23. Karras, T., Laine, S., Aila, T.: A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4401–4410 (2019)

  24. Karras, T., Laine, S., Aittala, M., Hellsten, J., Lehtinen, J., Aila, T.: Analyzing and improving the image quality of StyleGAN. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 8110–8119 (2020)

  25. Lin, C.H., Chang, C., Chen, Y., Juan, D., Wei, W., Chen, H.: COCO-GAN: generation by parts via conditional coordinating. arXiv:1904.00284 (2019)

  26. Lipton, Z.C., Tripathi, S.: Precise recovery of latent vectors from generative adversarial networks. arXiv preprint arXiv:1702.04782 (2017)

  27. Locatello, F., et al.: Challenging common assumptions in the unsupervised learning of disentangled representations. In: International Conference on Machine Learning, pp. 4114–4124. PMLR (2019)

  28. Marriott, R.T., Madiouni, S., Romdhani, S., Gentric, S., Chen, L.: An assessment of GANs for identity-related applications. In: 2020 IEEE International Joint Conference on Biometrics (IJCB), pp. 1–10 (2020). https://doi.org/10.1109/IJCB48548.2020.9304879

  29. Nguyen-Phuoc, T., Li, C., Theis, L., Richardt, C., Yang, Y.L.: HoloGAN: unsupervised learning of 3D representations from natural images. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 7588–7597 (2019)

  30. Perarnau, G., Van De Weijer, J., Raducanu, B., Álvarez, J.M.: Invertible conditional GANs for image editing. arXiv preprint arXiv:1611.06355 (2016)

  31. Pidhorskyi, S., Adjeroh, D.A., Doretto, G.: Adversarial latent autoencoders. arXiv:2004.04467 (2020)

  32. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015)

  33. Ramaswamy, V.V., Kim, S.S., Russakovsky, O.: Fair attribute classification through latent space de-biasing. arXiv preprint arXiv:2012.01469 (2020)

  34. Razavi, A., van den Oord, A., Vinyals, O.: Generating diverse high-fidelity images with VQ-VAE-2. In: Advances in Neural Information Processing Systems, pp. 14866–14876 (2019)

  35. Richardson, E., et al.: Encoding in style: a StyleGAN encoder for image-to-image translation. arXiv:2008.00951 (2020)

  36. Salimans, T., Goodfellow, I.J., Zaremba, W., Cheung, V., Radford, A., Chen, X.: Improved techniques for training GANs. arXiv:1606.03498 (2016)

  37. dos Santos Tanaka, F.H.K., Aranha, C.: Data augmentation using GANs. arXiv:1904.09135 (2019)

  38. Sattigeri, P., Hoffman, S.C., Chenthamarakshan, V., Varshney, K.R.: Fairness GAN. arXiv preprint arXiv:1805.09910 (2018)

  39. Sharmanska, V., Hendricks, L.A., Darrell, T., Quadrianto, N.: Contrastive examples for addressing the tyranny of the majority. arXiv preprint arXiv:2004.06524 (2020)

  40. Soomro, K., Zamir, A.R., Shah, M.: UCF101: a dataset of 101 human actions classes from videos in the wild. arXiv:1212.0402 (2012)

  41. Tewari, A., et al.: StyleRig: rigging StyleGAN for 3D control over portrait images. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2020)

  42. Tolstikhin, I., Bousquet, O., Gelly, S., Schoelkopf, B.: Wasserstein auto-encoders. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=HkL7n1-0b

  43. Voynov, A., Babenko, A.: Unsupervised discovery of interpretable directions in the GAN latent space. arXiv preprint arXiv:2002.03754 (2020)

  44. Wei, T., et al.: A simple baseline for StyleGAN inversion. arXiv:2104.07661 (2021)

  45. Wulff, J., Torralba, A.: Improving inversion and generation diversity in StyleGAN using a Gaussianized latent space. arXiv preprint arXiv:2009.06529 (2020)

  46. Xia, W., Zhang, Y., Yang, Y., Xue, J.H., Zhou, B., Yang, M.H.: GAN inversion: a survey. arXiv preprint arXiv:2101.05278 (2021)

  47. Xu, H., et al.: Adversarial attacks and defenses in images, graphs and text: a review. arXiv:1909.08072 (2019). https://doi.org/10.48550/ARXIV.1909.08072

  48. Xu, Y., Shen, Y., Zhu, J., Yang, C., Zhou, B.: Generative hierarchical features from synthesizing images. In: CVPR (2021)

  49. Yu, J., et al.: Vector-quantized image modeling with improved VQGAN. In: International Conference on Learning Representations (ICLR) (2022)

  50. Zhang, R., Isola, P., Efros, A.A., Shechtman, E., Wang, O.: The unreasonable effectiveness of deep features as a perceptual metric. In: CVPR (2018)

  51. Zhu, J., Shen, Y., Zhao, D., Zhou, B.: In-domain GAN inversion for real image editing. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12362, pp. 592–608. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58520-4_35

  52. Zhu, J., Zhao, D., Zhang, B.: LIA: latently invertible autoencoder with adversarial learning. arXiv:1906.08090 (2019)

  53. Zhu, J.-Y., Krähenbühl, P., Shechtman, E., Efros, A.A.: Generative visual manipulation on the natural image manifold. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9909, pp. 597–613. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46454-1_36

  54. Zietlow, D., et al.: Leveling down in computer vision: Pareto inefficiencies in fair deep classifiers. arXiv:2203.04913 (2022). https://doi.org/10.48550/ARXIV.2203.04913

Acknowledgement

We thank Alex Vorobiov, Javier Romero, Betty Mohler Tesch and Soubhik Sanyal for their insightful comments and intriguing discussions. While PG and DZ are affiliated with Max Planck Institute for Intelligent Systems, this project was completed during PG’s and DZ’s internship at Amazon. MJB performed this work while at Amazon.

Author information

Corresponding author

Correspondence to Partha Ghosh.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Ghosh, P., Zietlow, D., Black, M.J., Davis, L.S., Hu, X. (2022). InvGAN: Invertible GANs. In: Andres, B., Bernard, F., Cremers, D., Frintrop, S., Goldlücke, B., Ihrke, I. (eds) Pattern Recognition. DAGM GCPR 2022. Lecture Notes in Computer Science, vol 13485. Springer, Cham. https://doi.org/10.1007/978-3-031-16788-1_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-16788-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16787-4

  • Online ISBN: 978-3-031-16788-1

  • eBook Packages: Computer Science, Computer Science (R0)
