StyleAutoEncoder for Manipulating Image Attributes Using Pre-trained StyleGAN

  • Conference paper
  • In: Advances in Knowledge Discovery and Data Mining (PAKDD 2024)

Abstract

Deep conditional generative models are excellent tools for creating high-quality images and editing their attributes. However, training modern generative models from scratch is very expensive and requires large computational resources. In this paper, we introduce StyleAutoEncoder (StyleAE), a lightweight AutoEncoder module, which works as a plugin for pre-trained generative models and allows for manipulating the requested attributes of images. The proposed method offers a cost-effective solution for training deep generative models with limited computational resources, making it a promising technique for a wide range of applications. We evaluate StyleAE by combining it with StyleGAN, which is currently one of the top generative models. Our experiments demonstrate that StyleAE is at least as effective in manipulating image attributes as the state-of-the-art algorithms based on invertible normalizing flows. However, it is simpler, faster, and gives more freedom in designing neural architecture.
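The idea described in the abstract, a lightweight autoencoder plugged on top of a frozen pre-trained generator's latent space, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the architecture sizes, the `StyleAE` class name, and the training objective (reconstruction of the style code plus a supervised loss tying the first latent coordinates to attribute labels) are all assumptions made here for clarity. Editing an attribute then amounts to overwriting those coordinates and decoding back to a style code for the frozen generator.

```python
import torch
import torch.nn as nn

class StyleAE(nn.Module):
    """Hypothetical sketch of an AutoEncoder module operating on style
    codes w (512-dim, as in StyleGAN) of a frozen pre-trained generator.
    The first `n_attr` latent coordinates are trained to carry the
    requested attributes; the rest preserve the remaining content."""

    def __init__(self, w_dim=512, n_attr=2, hidden=256):
        super().__init__()
        self.n_attr = n_attr
        self.encoder = nn.Sequential(
            nn.Linear(w_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, w_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(w_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, w_dim),
        )

    def forward(self, w):
        z = self.encoder(w)
        return self.decoder(z), z

    def edit(self, w, attr_values):
        # Overwrite the attribute coordinates with target values while
        # keeping the remaining (content) coordinates intact.
        z = self.encoder(w).clone()
        z[:, : self.n_attr] = attr_values
        return self.decoder(z)


# Toy training step (an assumed objective, for illustration only):
# reconstruct w, and push the first latent coordinates toward labels.
model = StyleAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
w = torch.randn(8, 512)                      # style codes from a frozen generator
labels = torch.randint(0, 2, (8, 2)).float() # binary attribute labels
w_rec, z = model(w)
loss = nn.functional.mse_loss(w_rec, w) \
     + nn.functional.mse_loss(z[:, :2], labels)
loss.backward()
opt.step()
```

Because only this small module is trained while the generator stays frozen, the training cost is a fraction of training a generative model from scratch, which is the cost advantage the abstract claims over normalizing-flow plugins.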



Acknowledgements

This research has been supported by the flagship project entitled “Artificial Intelligence Computing Center Core Facility” from the Priority Research Area Digi World under the Strategic Programme Excellence Initiative at Jagiellonian University. The work of M. Śmieja was supported by the National Science Centre (Poland), grant no. 2022/45/B/ST6/01117.

Author information

Correspondence to Andrzej Bedychaj.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Bedychaj, A., Tabor, J., Śmieja, M. (2024). StyleAutoEncoder for Manipulating Image Attributes Using Pre-trained StyleGAN. In: Yang, D.N., Xie, X., Tseng, V.S., Pei, J., Huang, J.W., Lin, J.C.W. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science, vol 14646. Springer, Singapore. https://doi.org/10.1007/978-981-97-2253-2_10

  • DOI: https://doi.org/10.1007/978-981-97-2253-2_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-2252-5

  • Online ISBN: 978-981-97-2253-2

  • eBook Packages: Computer Science (R0)
