Improved Generative Adversarial Network Learning via Structural Pattern Classification

Neural Processing Letters

Abstract

Generative adversarial networks (GANs), a leading class of generative models, have shown remarkable capacity for producing photo-realistic images. In practice, however, learning complex representations of the data distribution remains challenging. In this paper, we propose the structural pattern classification task, which enriches the training of GANs in a self-supervised manner. We first leverage a self-attention layer added to the generator to synthesize images with different structural patterns: three distinct feature matrices are randomly swapped before the self-attention feature maps are calculated, and each pattern stands for one possible combination. Meanwhile, we annotate real images with a fixed pattern. The adversarial training is then coupled with an auxiliary classification task in which the discriminator must identify the correct structural pattern of its input images. This auxiliary task provides an additional perspective from which the discriminator can learn valuable representations of the data distribution. Empirical studies on CIFAR-10, STL-10, and CELEB-A demonstrate the effectiveness of the proposed structural pattern classification in improving the quality and diversity of the generated images. Limitations remain regarding interpretability and the exploration of higher-resolution images.
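The swapping mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the three feature matrices are the query, key, and value projections of a SAGAN-style self-attention layer, that all three share one shape so they can be exchanged, that the patterns are the 3! = 6 orderings of those matrices, and that real images carry the identity ordering as their fixed label. The names `PATTERNS`, `patterned_self_attention`, `pattern_classification_loss`, and `REAL_PATTERN_ID` are hypothetical.

```python
import itertools
import numpy as np

# Assumption: one structural pattern per ordering of the three feature
# matrices, so there are 3! = 6 classes. The list index is the class label.
PATTERNS = list(itertools.permutations(range(3)))

# Assumption: real images are annotated with the unswapped ordering (0, 1, 2).
REAL_PATTERN_ID = 0


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def patterned_self_attention(x, w_f, w_g, w_h, pattern_id):
    """Self-attention whose three feature matrices are permuted by pattern_id.

    x: (n, c) flattened feature map with n spatial positions.
    w_f, w_g, w_h: (c, d) projection weights (same shape, by assumption).
    """
    feats = [x @ w_f, x @ w_g, x @ w_h]
    # Reassign the query/key/value roles according to the chosen pattern.
    query, key, value = (feats[i] for i in PATTERNS[pattern_id])
    attn = softmax(query @ key.T, axis=-1)  # (n, n) attention map
    return attn @ value                     # (n, d) attended features


def pattern_classification_loss(logits, pattern_ids):
    """Mean cross-entropy between (batch, 6) logits and integer pattern labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(pattern_ids)), pattern_ids].mean()


# Toy usage: draw a random pattern for a generated sample.
rng = np.random.default_rng(0)
n, c, d = 16, 8, 8
x = rng.normal(size=(n, c))
w_f, w_g, w_h = (rng.normal(size=(c, d)) for _ in range(3))
pattern_id = int(rng.integers(len(PATTERNS)))
out = patterned_self_attention(x, w_f, w_g, w_h, pattern_id)
```

In a full training loop, the discriminator would presumably emit the six pattern logits from an auxiliary head alongside its real/fake output, and the cross-entropy above would be added to the adversarial loss. How the two terms are weighted is not stated in the abstract and is left open here.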


Data availability

Not applicable.

Code availability

Not applicable.


Funding

Not applicable.

Author information

Contributions

QZ and JZ contributed to the conception of the study; QZ performed the experiment; QZ and JZ wrote the manuscript; all authors reviewed the manuscript.

Corresponding author

Correspondence to Jianwei Zhang.

Ethics declarations

Conflict of interest

Not applicable.

Ethics approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.


About this article


Cite this article

Zhou, Q., Zhang, J. & Han, G. Improved Generative Adversarial Network Learning via Structural Pattern Classification. Neural Process Lett 55, 9685–9697 (2023). https://doi.org/10.1007/s11063-023-11221-4


  • DOI: https://doi.org/10.1007/s11063-023-11221-4
