Enhanced self-supervised GANs with blend ratio classification

Multimedia Tools and Applications

Abstract

Generative adversarial networks (GANs) have achieved remarkable success in image generation, and conditional GANs in particular can be trained to derive reliable representations. Their main drawback, however, is the need for labeled data. Self-supervision can fill this gap, but two challenges remain: discovering more reliable self-supervised signals, and finding ways to couple different signals so that they describe the characteristics of the training data more precisely. In this paper, we propose a novel self-supervised learning approach that automatically generates pseudo labels for autoencoder-based GANs. Specifically, we blend each input image with its reconstruction to produce transformed samples controlled by a blend ratio. An additional classifier attached to the discriminator must then predict the proportion of the real image in each transformed sample in order to derive meaningful representations. Next, we enhance GANs with multiple self-supervision signals in two different ways to further improve the capacity of the discriminator: one merges the signals and requires the classifier to predict the mixed probability, whereas the other uses the signals independently. In experiments, we evaluate the quality of the generated images and the learned representations on three datasets. Empirical results demonstrate the effectiveness of our methods for both image synthesis and representation learning.
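The blending step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the blend operator or the ratio values, so the pixel-wise linear blend, the discrete ratio set `BLEND_RATIOS`, and the function names below are all assumptions made for illustration, not the authors' implementation. Images are represented as flat lists of floats to keep the example dependency-free.

```python
import random

# Hypothetical discrete blend ratios (an assumption; the abstract gives no values).
# The index of the chosen ratio serves as the pseudo label.
BLEND_RATIOS = [0.0, 0.25, 0.5, 0.75, 1.0]


def blend(real, recon, ratio):
    """Assumed pixel-wise linear blend: ratio * real + (1 - ratio) * recon."""
    if len(real) != len(recon):
        raise ValueError("real and reconstructed images must have the same size")
    return [ratio * x + (1.0 - ratio) * r for x, r in zip(real, recon)]


def make_pseudo_labeled_batch(reals, recons, rng, ratios=BLEND_RATIOS):
    """Blend each image with its reconstruction at a randomly chosen ratio.

    Returns the transformed samples and, for each one, the index of the
    ratio that was used. A ratio classifier would be trained on these pairs.
    """
    samples, labels = [], []
    for real, recon in zip(reals, recons):
        label = rng.randrange(len(ratios))  # pick one ratio class uniformly
        samples.append(blend(real, recon, ratios[label]))
        labels.append(label)
    return samples, labels


if __name__ == "__main__":
    rng = random.Random(0)
    reals = [[1.0, 1.0, 1.0, 1.0] for _ in range(3)]   # stand-ins for input images
    recons = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]  # stand-ins for autoencoder outputs
    samples, labels = make_pseudo_labeled_batch(reals, recons, rng)
    for sample, label in zip(samples, labels):
        print(label, sample)
```

In this sketch the pseudo label requires no human annotation: it is determined entirely by the ratio drawn when the sample is constructed, which is the sense in which the labels are generated automatically.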

Author information

Corresponding author

Correspondence to Jianwei Zhang.

About this article

Cite this article

Zhou, Q., Zhang, J., Han, G. et al. Enhanced self-supervised GANs with blend ratio classification. Multimed Tools Appl 81, 7651–7667 (2022). https://doi.org/10.1007/s11042-022-12056-2

  • DOI: https://doi.org/10.1007/s11042-022-12056-2
