Abstract
Generative adversarial networks (GANs) have achieved remarkable success in image generation, and conditional GANs in particular learn reliable representations. Their main drawback is that they require labeled data. Self-supervision can supply the missing labels, but two challenges remain: discovering more reliable self-supervised signals, and coupling different signals so that they describe the training data more precisely. In this paper, we propose a novel self-supervised learning approach that automatically generates pseudo labels for autoencoder-based GANs. Specifically, we blend each input image with its reconstruction to produce transformed samples controlled by a blend ratio. An additional classifier attached to the discriminator must then predict the proportion of the real image in each transformed sample, which pushes the discriminator to learn meaningful representations. Next, we enhance GANs with multiple self-supervision signals in two different ways to further improve the capacity of the discriminator. The first merges the supervision signals and requires the classifier to predict the mixed probability, whereas the second uses the signals independently. In experiments on three datasets, we evaluate the quality of the generated images and of the learned representations. Empirical results demonstrate the effectiveness of our methods on both image synthesis and representation learning.
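The blending step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the set of ratios in `BLEND_RATIOS`, the function name `blend_with_pseudo_labels`, and the random arrays standing in for real images and autoencoder reconstructions are all assumptions made for illustration. The sketch only shows how a blend ratio can double as a pseudo label for a classifier head.

```python
import numpy as np

# Hypothetical discrete set of blend ratios; the abstract does not state the values.
BLEND_RATIOS = np.array([0.0, 0.25, 0.5, 0.75, 1.0])


def blend_with_pseudo_labels(real, recon, rng):
    """Blend each real image with its reconstruction.

    Returns the blended batch and, per sample, the index of the ratio used.
    That index serves as the pseudo label a classifier would have to predict.
    """
    # Draw one ratio index per sample in the batch.
    labels = rng.integers(0, len(BLEND_RATIOS), size=real.shape[0])
    # Reshape to (N, 1, 1, 1) so the ratio broadcasts over channels and pixels.
    ratios = BLEND_RATIOS[labels].reshape(-1, 1, 1, 1)
    # A ratio of 1.0 keeps the real image; 0.0 keeps the reconstruction.
    blended = ratios * real + (1.0 - ratios) * recon
    return blended, labels


rng = np.random.default_rng(0)
real = rng.random((8, 3, 32, 32))   # stand-in for a batch of real images
recon = rng.random((8, 3, 32, 32))  # stand-in for autoencoder reconstructions
blended, labels = blend_with_pseudo_labels(real, recon, rng)
```

In a full training loop, `blended` would be passed through the discriminator and `labels` would be the targets of a cross-entropy loss on the attached classifier; that part is omitted here because the abstract does not specify it.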
Cite this article
Zhou, Q., Zhang, J., Han, G. et al. Enhanced self-supervised GANs with blend ratio classification. Multimed Tools Appl 81, 7651–7667 (2022). https://doi.org/10.1007/s11042-022-12056-2