Abstract
Cycle consistency extends generative adversarial networks from aligned image pairs to unpaired training sets and can be applied to a wide range of image-to-image translation tasks. However, errors that accumulate during image reconstruction can degrade the realism and quality of the generated images. To address this, we propose a novel long-short cycle-consistent loss that is simple and easy to implement. Our dual-cycle constrained cross-domain image-to-image translation method suppresses error accumulation and strengthens adversarial learning. When image information is transferred from one domain to another, the cycle-consistency-based reconstruction constraint is enforced over both short and long cycles to eliminate accumulated error. We adopt a cascading scheme with dual-cycle consistency, in which the reconstructed image from the first cycle serves as the input to the next cycle. Extensive experiments on several datasets show a distinct improvement over baseline approaches in most translation scenarios.
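The cascading idea in the abstract can be sketched in a few lines. Below is a minimal, self-contained illustration of a long-short (dual) cycle-consistent loss: the short cycle penalizes the usual one-pass reconstruction, and the long cycle re-feeds that reconstruction through a second cycle so accumulated error is also penalized. The generators `G` and `F`, the function names, and the weights `lam_short`/`lam_long` are all hypothetical stand-ins — in the paper the generators are adversarially trained CNNs, not the toy affine maps used here.

```python
def G(x):
    # toy forward generator X -> Y (hypothetical affine map)
    return [2.0 * v + 1.0 for v in x]

def F(y):
    # toy backward generator Y -> X (near-inverse of G)
    return [(v - 1.0) / 2.0 for v in y]

def l1(a, b):
    # mean absolute error between two same-length vectors
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)

def long_short_cycle_loss(x, lam_short=1.0, lam_long=1.0):
    # Short cycle: x -> G(x) -> F(G(x)) should reconstruct x.
    x_rec1 = F(G(x))
    short = l1(x_rec1, x)
    # Long cycle (cascade): the first reconstruction is fed back as the
    # input of a second cycle, so error accumulated across both passes
    # is also measured against the original x.
    x_rec2 = F(G(x_rec1))
    long_ = l1(x_rec2, x)
    return lam_short * short + lam_long * long_

x = [0.0, 1.0, -2.0]
loss = long_short_cycle_loss(x)
```

With an imperfect `F` the long-cycle term grows faster than the short-cycle term across the two passes, which is the behaior the cascaded constraint is designed to suppress; here `F` exactly inverts `G`, so the loss is zero.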
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant No. 61703260, and partially supported by the National Natural Science Foundation of China under Grant No. 62173252.
About this article
Cite this article
Wang, G., Shi, H., Chen, Y. et al. Unsupervised image-to-image translation via long-short cycle-consistent adversarial networks. Appl Intell 53, 17243–17259 (2023). https://doi.org/10.1007/s10489-022-04389-0