SA-SinGAN: self-attention for single-image generation adversarial networks

Chen, Xi; Zhao, Hongdong; Yang, Dongxu; Li, Yueyuan; Kang, Qing; Lu, Haiyan

doi:10.1007/s00138-021-01228-z

Original Paper
Published: 09 July 2021

Machine Vision and Applications, volume 32, article number 104 (2021)

Xi Chen¹,
Hongdong Zhao¹,
Dongxu Yang¹,
Yueyuan Li²,
Qing Kang¹ &
…
Haiyan Lu¹

Abstract

Training on a single image is an active research topic for generative adversarial networks, especially for tasks such as image editing and image harmonization. However, existing networks suffer from long training times, poor image quality, and unstable training. To address these problems, we propose a single-image generative adversarial network with a self-attention mechanism and discuss how the model changes when the self-attention mechanism is placed at different positions in the generator. We introduce spectral normalization in the generator and discriminator networks to stabilize training, and we compare the influence of the learning rate on the network. We evaluate the model by visual inspection and with model evaluation methods on three representative datasets and compare it with more advanced recent models. Experiments show that the proposed model outperforms the single-sample generative adversarial network, reducing the Single Image Fréchet Inception Distance (SIFID) from 4.80 to 2.057 on the challenging Generation datasets, from 0.06 to 0.02 on the Places datasets, and from 0.23 to 0.04 on the LSUN datasets. Our model trains in one-ninth of the time required by the single-sample generative adversarial network while still capturing the overall structure of the single training sample.
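
The two components named in the abstract, self-attention over generator feature maps and spectral normalization of the weights, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `spectral_normalize` and `self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the residual scale `gamma` are assumptions made for illustration, and the attention formulation is one common variant rather than the specific design evaluated in the paper.

```python
import numpy as np


def spectral_normalize(w, n_iters=50, seed=0):
    """Scale a 2-D weight matrix by an estimate of its largest singular value.

    The estimate is obtained by power iteration (hypothetical helper).
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(w.shape[0])
    for _ in range(n_iters):
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v  # estimate of the largest singular value
    return w / sigma


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(feat, w_q, w_k, w_v, gamma=0.0):
    """Apply self-attention to a feature map of shape (C, H, W).

    Every spatial position attends to every other position; the attended
    features are added back to the input, scaled by gamma (assumed here).
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)        # (C, N), N = H * W positions
    q = w_q @ x                       # queries, (C', N)
    k = w_k @ x                       # keys,    (C', N)
    v = w_v @ x                       # values,  (C, N)
    attn = softmax(q.T @ k, axis=-1)  # (N, N), each row sums to 1
    out = v @ attn.T                  # out[:, i] = sum_j attn[i, j] * v[:, j]
    return (gamma * out + x).reshape(c, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((4, 8, 8))
    w_q = spectral_normalize(rng.standard_normal((2, 4)))
    w_k = spectral_normalize(rng.standard_normal((2, 4)))
    w_v = spectral_normalize(rng.standard_normal((4, 4)))
    print(self_attention(feat, w_q, w_k, w_v, gamma=0.5).shape)
```

With `gamma=0.0` the block returns its input unchanged, so in this sketch an attention layer inserted at any generator position starts out as an identity mapping.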

Author information

Authors and Affiliations

School of Electronic and Information Engineering, Hebei University of Technology, Tianjin, 300401, China
Xi Chen, Hongdong Zhao, Dongxu Yang, Qing Kang & Haiyan Lu
School of Physics and Electronic Engineering, Northwest Normal University, LanZhou, 730071, China
Yueyuan Li

Corresponding author

Correspondence to Hongdong Zhao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, X., Zhao, H., Yang, D. et al. SA-SinGAN: self-attention for single-image generation adversarial networks. Machine Vision and Applications 32, 104 (2021). https://doi.org/10.1007/s00138-021-01228-z

Received: 17 March 2021
Revised: 27 June 2021
Accepted: 28 June 2021
Published: 09 July 2021
DOI: https://doi.org/10.1007/s00138-021-01228-z
