
Generative adversarial network with hybrid attention and compromised normalization for multi-scene image conversion

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

To generate high-quality realistic images, this paper proposes an image conversion algorithm based on a hybrid attention generative adversarial network. The network consists of a generator and a discriminator, which are trained jointly through the loss function. The generator uses a three-stage structure of down-sampling, residual and up-sampling blocks, where the residual blocks employ a hybrid attention mechanism. A compromised instance and layer normalization is also proposed, in which the two normalizations are weighted by the output of a fully connected layer. A multi-scale PatchGAN is introduced as the discriminator. The proposed network produces more realistic images with a new loss function comprising four terms: generative adversarial loss, \(L_1\) regularization loss, VGG loss and feature matching loss. Experimental results demonstrate that the proposed method produces more realistic and detailed images than state-of-the-art methods.
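The abstract does not spell out the compromised normalization in detail; the following is a minimal PyTorch sketch under the assumption that the fully connected layer predicts per-channel weights that blend the instance-normalized and layer-normalized outputs, similar in spirit to adaptive layer-instance normalization. All names (`CompromisedNorm`, `cond_dim`, `rho`, the conditioning vector `s`) are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class CompromisedNorm(nn.Module):
    """Blend instance and layer normalization with per-channel weights
    predicted by a fully connected layer (illustrative sketch only)."""

    def __init__(self, num_channels: int, cond_dim: int = 64, eps: float = 1e-5):
        super().__init__()
        self.eps = eps
        # Fully connected layer producing one blending weight per channel.
        # Conditioning on a feature vector `s` is an assumption of this sketch.
        self.fc = nn.Linear(cond_dim, num_channels)

    def forward(self, x: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
        # Instance normalization: per-sample, per-channel spatial statistics.
        in_mean = x.mean(dim=(2, 3), keepdim=True)
        in_var = x.var(dim=(2, 3), keepdim=True, unbiased=False)
        x_in = (x - in_mean) / torch.sqrt(in_var + self.eps)

        # Layer normalization: per-sample statistics over channels and space.
        ln_mean = x.mean(dim=(1, 2, 3), keepdim=True)
        ln_var = x.var(dim=(1, 2, 3), keepdim=True, unbiased=False)
        x_ln = (x - ln_mean) / torch.sqrt(ln_var + self.eps)

        # Compromise: sigmoid keeps the per-channel weight rho in (0, 1).
        rho = torch.sigmoid(self.fc(s)).unsqueeze(-1).unsqueeze(-1)  # (N, C, 1, 1)
        return rho * x_in + (1.0 - rho) * x_ln
```

Likewise, a hedged sketch of the four-term generator objective named in the abstract. The least-squares adversarial form and the `lambda_*` weights are assumptions of this sketch, not values reported by the paper.

```python
import torch
import torch.nn.functional as F

def generator_loss(d_fake_logits, fake, real,
                   vgg_fake, vgg_real, d_feats_fake, d_feats_real,
                   lambda_l1=10.0, lambda_vgg=10.0, lambda_fm=10.0):
    """Adversarial + L1 + VGG perceptual + discriminator feature matching."""
    adv = F.mse_loss(d_fake_logits, torch.ones_like(d_fake_logits))
    l1 = F.l1_loss(fake, real)
    vgg = sum(F.l1_loss(f, r) for f, r in zip(vgg_fake, vgg_real))
    fm = sum(F.l1_loss(f, r) for f, r in zip(d_feats_fake, d_feats_real))
    return adv + lambda_l1 * l1 + lambda_vgg * vgg + lambda_fm * fm
```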



Acknowledgements

This work was funded by the Social Science Foundation of Shaanxi Province (Grant No. 2019H010), the New Star of Youth Science and Technology of Shaanxi Province (Grant No. 2020KJXX-007) and the Open Project Program Foundation of the Key Laboratory of Opto-Electronics Information Processing, Chinese Academy of Sciences (OEIP-O-202009). The numerical calculations in this paper were performed on the supercomputing system at the Supercomputing Center of Wuhan University.

Author information

Corresponding authors

Correspondence to Jinsheng Xiao or Yongqin Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no commercial or associative interests that represent a conflict of interest in connection with the submitted work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Xiao, J., Zhang, S., Yao, Y. et al. Generative adversarial network with hybrid attention and compromised normalization for multi-scene image conversion. Neural Comput & Applic 34, 7209–7225 (2022). https://doi.org/10.1007/s00521-021-06841-7

