Abstract
To generate high-quality, realistic images, this paper proposes an image conversion algorithm based on a hybrid-attention generative adversarial network. The network consists of a generator and a discriminator, which are trained jointly through a loss function. The generator uses a three-stage structure of down-sampling, residual, and up-sampling blocks, where the residual blocks employ a hybrid attention mechanism. A compromised instance and layer normalization is also proposed, which weights the two normalizations by the output of a fully connected layer. A multi-scale PatchGAN is introduced as the discriminator. The proposed network produces more realistic images using a new loss function comprising four terms: a generative adversarial loss, an \(L_1\) regularization loss, a VGG loss, and a feature-matching loss. Experimental results demonstrate that the proposed method produces more realistic and detailed images than state-of-the-art methods.
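The compromised normalization described above blends instance normalization (statistics over each spatial plane, per sample and channel) with layer normalization (statistics over all features of a sample), weighted by the output of a fully connected layer. The following is a minimal NumPy sketch of that blend; the weight `rho` is passed in as a given scalar in \([0, 1]\) rather than produced by a learned layer, and the learnable affine (scale/shift) parameters of the paper's network are omitted, so this illustrates only the weighting scheme, not the authors' exact implementation.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # Normalize each (H, W) plane independently per sample and channel.
    mu = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize over all features (C, H, W) of each sample.
    mu = x.mean(axis=(1, 2, 3), keepdims=True)
    var = x.var(axis=(1, 2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def compromised_norm(x, rho):
    # rho in [0, 1] weights instance vs. layer statistics; in the paper it
    # comes from a fully connected layer, here it is supplied directly.
    return rho * instance_norm(x) + (1.0 - rho) * layer_norm(x)

# Toy feature map: batch of 2, 4 channels, 8x8 spatial resolution.
x = np.random.randn(2, 4, 8, 8)
y = compromised_norm(x, rho=0.7)
print(y.shape)  # (2, 4, 8, 8)
```

At `rho = 1` the result reduces to pure instance normalization and at `rho = 0` to pure layer normalization, so the learned weight lets the network interpolate between the two per feature map.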

Acknowledgements
This work is funded by the Social Science Foundation of Shaanxi Province (Grant No. 2019H010), the New Star of Youth Science and Technology of Shaanxi Province (Grant No. 2020KJXX-007), and the Open Project Program Foundation of the Key Laboratory of Opto-Electronics Information Processing, Chinese Academy of Sciences (OEIP-O-202009). The numerical calculations in this paper were performed on the supercomputing system in the Supercomputing Center of Wuhan University.
Ethics declarations
Conflict of interest
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Xiao, J., Zhang, S., Yao, Y. et al. Generative adversarial network with hybrid attention and compromised normalization for multi-scene image conversion. Neural Comput & Applic 34, 7209–7225 (2022). https://doi.org/10.1007/s00521-021-06841-7