DOI: 10.1145/3512527.3531400
research-article

DiGAN: Directional Generative Adversarial Network for Object Transfiguration

Published: 27 June 2022

ABSTRACT

Cycle consistency in coupled mappings has helped CycleGAN achieve remarkable performance in image-to-image translation. However, its limitations in object transfiguration have not yet been satisfactorily resolved. To alleviate the known problems of wrongly positioned transformations, degeneration, and artifacts, this work presents a new approach to object transfiguration called the Directional Generative Adversarial Network (DiGAN). The contribution of this work is threefold. First, paired directional generators are designed for both intra-domain and inter-domain generation. Second, a segmentation network based on Mask R-CNN is introduced to build conditional inputs for both generators and discriminators. Third, a feature loss and a segmentation loss are added to optimize the model. In horse-to-zebra mapping, experimental results indicate that DiGAN surpasses CycleGAN and AttentionGAN, respectively, by 17.2% and 60.9% on Inception Score (higher is better), by 15.5% and 2.05% on Fréchet Inception Distance (lower is better), and by 14.2% and 15.6% on VGG distance (lower is better).
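
The cycle-consistency idea that the abstract builds on can be illustrated with a minimal sketch. This is the generic CycleGAN-style L1 formulation, not DiGAN's actual loss, which the abstract does not spell out. The names `l1_distance`, `cycle_consistency_loss`, `g`, `f_good`, and `f_bad` are hypothetical, and the "images" are toy lists of floats standing in for tensors.

```python
from typing import Callable, List

Image = List[float]  # assumption: a flattened toy "image"; real models use tensors
Generator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(x: Image, y: Image, g: Generator, f: Generator) -> float:
    """Generic cycle loss: |F(G(x)) - x| + |G(F(y)) - y|, averaged per pixel.

    g maps domain X to domain Y and f maps Y back to X.
    """
    forward = l1_distance(f(g(x)), x)   # X -> Y -> X should return to x
    backward = l1_distance(g(f(y)), y)  # Y -> X -> Y should return to y
    return forward + backward


# Toy generators (hypothetical): g is an affine map, f_good is its exact
# inverse, and f_bad is an imperfect inverse.
def g(img: Image) -> Image:
    return [2.0 * v + 1.0 for v in img]


def f_good(img: Image) -> Image:
    return [(v - 1.0) / 2.0 for v in img]


def f_bad(img: Image) -> Image:
    return [v / 2.0 for v in img]


x = [0.0, 0.5, 1.0]  # sample from domain X
y = [1.0, 2.0, 3.0]  # sample from domain Y

loss_good = cycle_consistency_loss(x, y, g, f_good)
loss_bad = cycle_consistency_loss(x, y, g, f_bad)
print(loss_good, loss_bad)
```

When the two mappings invert each other, the loss vanishes; an imperfect inverse is penalized in both directions. In the abstract's setting, this term would be only one part of the objective, alongside the adversarial, feature, and segmentation losses.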

Supplemental Material

ICMR22-fp218.mp4 (mp4, 25.1 MB)


Published in

ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval
June 2022
714 pages
ISBN: 9781450392389
DOI: 10.1145/3512527

Copyright © 2022 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 254 of 830 submissions, 31%
