
Unsupervised Transformation Network Based on GANs for Target-Domain Oriented Multi-domain Image Translation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11362)

Abstract

Multi-domain image translation with unpaired data is a challenging problem. This paper proposes a generalized GAN-based unsupervised multi-domain transformation network (UMT-GAN) for image translation. The generation network of UMT-GAN consists of a universal encoder, a reconstructor, and a series of translators corresponding to different target domains. The encoder learns the information universal to all domains. The reconstructor extracts hierarchical representations of the images by minimizing a reconstruction loss. The translators perform the multi-domain translation. The reconstructor and each translator are connected to their own discriminators for adversarial training. Importantly, the high-level representations are shared between the source domain and the multiple target domains, and all network components are trained jointly under a single loss function. In particular, instead of generating high-resolution images from a random vector z, UMT-GAN employs the source-domain images as the inputs of the generator, which helps the model avoid mode collapse to a certain extent. Experimental studies demonstrate the effectiveness and superiority of the proposed algorithm compared with several state-of-the-art algorithms.
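The abstract's description maps naturally to a shared-encoder, multi-decoder generator: one encoder feeds both a reconstruction branch and one translation branch per target domain. The PyTorch sketch below illustrates that topology only; every layer shape, the choice of L1 for the reconstruction loss, and the names Encoder, Decoder, and UMTGenerator are illustrative assumptions, not the paper's actual implementation, and the per-output discriminators and adversarial terms of the joint loss are omitted.

```python
import torch
import torch.nn as nn

# Minimal sketch of the UMT-GAN generation network described in the abstract.
# All layer sizes and design details here are assumptions for illustration;
# the paper itself specifies the real encoder/translator architectures.

class Encoder(nn.Module):
    """Universal encoder shared by the source and all target domains."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch * 2, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Used both as the reconstructor and as one translator per target domain."""
    def __init__(self, out_ch=3, feat_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(feat_ch * 2, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(feat_ch, out_ch, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, h):
        return self.net(h)

class UMTGenerator(nn.Module):
    def __init__(self, num_target_domains):
        super().__init__()
        self.encoder = Encoder()          # shared high-level representation
        self.reconstructor = Decoder()    # reconstructs the source image
        self.translators = nn.ModuleList(
            [Decoder() for _ in range(num_target_domains)]
        )

    def forward(self, x_src):
        h = self.encoder(x_src)           # source image as input, not a noise vector z
        recon = self.reconstructor(h)
        translations = [t(h) for t in self.translators]
        return recon, translations

# Usage: the reconstruction term of the joint loss; each translated output
# would additionally receive an adversarial loss from its own discriminator.
x = torch.randn(1, 3, 64, 64)
gen = UMTGenerator(num_target_domains=3)
recon, outs = gen(x)
rec_loss = nn.functional.l1_loss(recon, x)
```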


Acknowledgment

The authors are grateful for the support of the National Natural Science Foundation of China (61572104, 61402076, 61502072), the Fundamental Research Funds for the Central Universities (DUT17JC04), and the Project of the Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172017K03).

Author information

Corresponding author

Correspondence to Hongwei Ge.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ge, H., Yao, Y., Chen, Z., Sun, L. (2019). Unsupervised Transformation Network Based on GANs for Target-Domain Oriented Multi-domain Image Translation. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11362. Springer, Cham. https://doi.org/10.1007/978-3-030-20890-5_26

  • DOI: https://doi.org/10.1007/978-3-030-20890-5_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20889-9

  • Online ISBN: 978-3-030-20890-5

  • eBook Packages: Computer Science, Computer Science (R0)
