Abstract
Generative adversarial networks (GANs), a leading class of generative models, have shown remarkable capacity for producing photo-realistic images. In practice, however, learning complex representations of the data distribution remains challenging. In this paper, we propose a structural pattern classification task that enriches GAN training in a self-supervised manner. We first use a self-attention layer added to the generator to synthesize images with different structural patterns: the three distinct feature matrices of the layer are randomly swapped before the self-attention feature maps are computed, and each pattern corresponds to one possible combination. Real images are annotated with a fixed pattern. Adversarial training is then coupled with an auxiliary classification task in which the discriminator must identify the correct structural pattern of each input image. This auxiliary task gives the discriminator an additional perspective from which to learn valuable representations of the data distribution. Empirical studies on CIFAR-10, STL-10, and CELEB-A demonstrate that the proposed structural pattern classification improves the quality and diversity of generated images. Limitations remain: the interpretability of the method and its application to higher-resolution images have yet to be explored.
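The swapping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the three feature matrices are taken to be the query, key, and value projections of the self-attention layer; all three are assumed to share one shape so that any of them can play any role; a "pattern" is assumed to be one of the 3! = 6 orderings; real images are assumed to carry the unswapped ordering (label 0); and the auxiliary loss is assumed to be a plain cross-entropy. The 1x1 convolutions and residual scaling of a full self-attention layer are omitted, and all function names are hypothetical.

```python
import itertools
import math
import random

# Assumption: a structural pattern is one ordering of (query, key, value).
PATTERNS = list(itertools.permutations(range(3)))  # 6 orderings
REAL_PATTERN = 0  # assumption: real images are labeled with the unswapped order


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def swapped_attention(q, k, v, pattern_id):
    """Compute attention after reordering (q, k, v) according to a pattern.

    Each matrix has shape (positions, channels); equal shapes are assumed.
    """
    feats = (q, k, v)
    a, b, c = (feats[i] for i in PATTERNS[pattern_id])
    weights = softmax_rows(matmul(a, transpose(b)))  # (positions, positions)
    return matmul(weights, c)                         # (positions, channels)


def sample_pattern(rng):
    """Draw the pattern label that the discriminator should predict."""
    return rng.randrange(len(PATTERNS))


def cross_entropy(logits, label):
    """Auxiliary classification loss for one image (assumed form)."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_z - logits[label]


# Usage: 5 spatial positions, 4 channels, random features.
rng = random.Random(0)
q, k, v = ([[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)] for _ in range(3))
label = sample_pattern(rng)
out = swapped_attention(q, k, v, label)
```

In a training loop, the generator would use `label` to pick the ordering for each synthesized image, and the discriminator's pattern head would be trained with `cross_entropy` against `label` for generated images and against `REAL_PATTERN` for real ones.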




Data availability
Not applicable.
Code availability
Not applicable.
Funding
Not applicable.
Author information
Contributions
QZ and JZ contributed to the conception of the study; QZ performed the experiment; QZ and JZ wrote the manuscript; all authors reviewed the manuscript.
Ethics declarations
Conflict of interest
Not applicable.
Ethics approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhou, Q., Zhang, J. & Han, G. Improved Generative Adversarial Network Learning via Structural Pattern Classification. Neural Process Lett 55, 9685–9697 (2023). https://doi.org/10.1007/s11063-023-11221-4
DOI: https://doi.org/10.1007/s11063-023-11221-4