DOI: 10.1145/3581807.3581904
research-article

A Symmetric Dual-Attention Generative Adversarial Network with Channel and Spatial Features Fusion

Published: 22 May 2023

Abstract

Many existing generative adversarial networks (GANs) lack effective semantic modeling, which leads to unnatural local details and blurring in generated images. Building on DivCo, we propose a Symmetric Dual-Attention Generative Adversarial Network (DivCo-SDAGAN) with channel and spatial feature fusion. It introduces a Dual-Attention Module (DAM) that strengthens the network's feature representation so that it can synthesize photo-realistic images with more natural local details. The DAM comprises a Channel Weighted Aggregation Module (CWAM) and a Spatial Attention Module (SAM), designed to capture semantic information along the channel and spatial dimensions, respectively; both can be easily integrated into other GAN-based models. Extensive experiments show that DivCo-SDAGAN produces more diverse images from the same input and achieves more satisfactory results than existing methods.
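
The abstract does not specify how the channel and spatial branches are computed or fused, so the following is only a minimal, parameter-free sketch of the general idea of dual attention over a feature map. The function names, the sigmoid gates, and the element-wise sum used for fusion are all assumptions made for illustration; they are not the paper's CWAM, SAM, or DAM, which would normally use learned weights.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here as a simple gate in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Reweight each channel of a (C, H, W) feature map.

    Assumption: the gate is a sigmoid of the channel's global average,
    standing in for a learned channel-weighting module.
    """
    pooled = feat.mean(axis=(1, 2))        # (C,) one statistic per channel
    weights = sigmoid(pooled)              # (C,) gate per channel
    return feat * weights[:, None, None]   # broadcast over H and W


def spatial_attention(feat):
    """Reweight each spatial position of a (C, H, W) feature map.

    Assumption: the mask is a sigmoid of the mean over channels,
    standing in for a learned spatial-attention module.
    """
    saliency = feat.mean(axis=0)           # (H, W) one statistic per position
    mask = sigmoid(saliency)               # (H, W) gate per position
    return feat * mask[None, :, :]         # broadcast over channels


def dual_attention(feat):
    """Fuse the two branches by element-wise sum (an assumed fusion rule)."""
    return channel_attention(feat) + spatial_attention(feat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 16, 16))   # hypothetical feature map
    y = dual_attention(x)
    print(y.shape)                         # same shape as the input
```

Because the output has the same shape as the input, a block of this kind can be dropped between existing layers of a generator, which is consistent with the abstract's remark that the modules integrate easily into other GAN-based models.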

    Published In

    ICCPR '22: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition
    November 2022
    683 pages
    ISBN:9781450397056
    DOI:10.1145/3581807

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Channel Attention
    2. Generative Adversarial Networks
    3. Semantic Image Synthesis
    4. Spatial Attention

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCPR 2022
