
DiGAN: Directional Generative Adversarial Network for Object Transfiguration

Authors:

Zhen Luo
Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Zhuhai, China

Yingfang Zhang
Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Zhuhai, China

Peihao Zhong
Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Zhuhai, China

Jingjing Chen
School of Computer Science, Fudan University, Shanghai, China

Donglong Chen
Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College, Zhuhai, China

ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval, June 2022, Pages 471–479
https://doi.org/10.1145/3512527.3531400

Published: 27 June 2022

ABSTRACT

Cycle consistency over a pair of coupled mappings has enabled CycleGAN to achieve remarkable performance in image-to-image translation. However, its limitations in object transfiguration have not yet been adequately addressed. To alleviate three known problems (transformation at the wrong position, degeneration, and artifacts), this work presents a new approach to object transfiguration called the Directional Generative Adversarial Network (DiGAN). The contribution of this work is threefold. First, paired directional generators are designed for both intra-domain and inter-domain generation. Second, a segmentation network based on Mask R-CNN is introduced to build conditional inputs for both generators and discriminators. Third, a feature loss and a segmentation loss are added to optimize the model. Experimental results on horse-to-zebra mapping indicate that DiGAN outperforms CycleGAN and AttentionGAN, respectively, with an Inception Score that is 17.2% and 60.9% higher, a Fréchet Inception Distance that is 15.5% and 2.05% lower, and a VGG distance that is 14.2% and 15.6% lower.
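To make the cycle-consistency idea mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual CycleGAN-style formulation, in which an image mapped to the other domain and back should match the original under an L1 penalty. The names `cycle_consistency_loss`, `invert`, and `flatten_to_gray` are hypothetical, and the two "generators" are toy functions on flattened pixel lists rather than neural networks.

```python
from typing import Callable, List

Image = List[float]  # toy stand-in: a flattened grayscale image in [0, 1]
Mapping = Callable[[Image], Image]


def cycle_consistency_loss(x: Image, forward: Mapping, backward: Mapping) -> float:
    """Mean absolute (L1) error between x and backward(forward(x))."""
    reconstructed = backward(forward(x))
    return sum(abs(a - b) for a, b in zip(x, reconstructed)) / len(x)


def invert(image: Image) -> Image:
    """Hypothetical 'generator' that is its own exact inverse."""
    return [1.0 - v for v in image]


def flatten_to_gray(image: Image) -> Image:
    """Hypothetical degenerate 'generator' that discards all content."""
    return [0.5 for _ in image]


# Toy input, not data from the paper.
x = [0.0, 0.25, 0.75, 1.0]
perfect = cycle_consistency_loss(x, invert, invert)
degenerate = cycle_consistency_loss(x, invert, flatten_to_gray)
print(perfect, degenerate)
```

A pair of mappings that invert each other incurs zero loss, while a backward mapping that ignores its input is penalized. This is the constraint that ties the two translation directions together.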

Supplemental Material

ICMR22-fp218.mp4 (mp4, 25.1 MB)

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Published in
ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval
June 2022
714 pages
ISBN: 9781450392389
DOI: 10.1145/3512527
General Chairs:
Vincent Oria, New Jersey Institute of Technology, USA
Maria Luisa Sapino, Università degli Studi di Torino, Italy
Shin'ichi Satoh, National Institute of Informatics, Japan
Brigitte Kerhervé, Université du Québec à Montréal, Canada
Program Chairs:
Wen-Huang Cheng, National Yang Ming Chiao Tung University, Taiwan
Ichiro Ide, Nagoya University, Japan
Vivek Singh, Rutgers University, USA
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
cycle consistency
feature consistency
generative adversarial network
object transfiguration
segment-conditional generation
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 254 of 830 submissions, 31%