
Unsupervised image-to-image translation via long-short cycle-consistent adversarial networks

Published in: Applied Intelligence

Abstract

Cycle consistency extends generative adversarial networks from aligned image pairs to unpaired training sets and underpins a wide range of image-to-image translation tasks. However, errors that accumulate during image reconstruction can degrade the realism and quality of the generated images. To address this, we propose a novel long-short cycle-consistent loss that is simple and easy to implement. Our dual-cycle-constrained cross-domain image-to-image translation method suppresses error accumulation and strengthens adversarial learning. When image content is translated from one domain to another, the cycle-consistency reconstruction constraint is enforced over both a short and a long cycle to eliminate error accumulation. The two cycles are cascaded: the image reconstructed in the first cycle serves as the input to the second. Extensive experiments on several datasets show a distinct improvement over baseline approaches in most translation scenarios, and the proposed method outperforms the tested competing approaches.
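The cascaded dual-cycle constraint described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it assumes two generators G: X→Y and F: Y→X, an L1 reconstruction penalty, and toy invertible functions standing in for the networks; all names are hypothetical.

```python
import numpy as np

def cycle_losses(x, G, F):
    """Short- and long-cycle L1 reconstruction losses for one
    translation direction X -> Y -> X."""
    # Short cycle: x -> G(x) -> F(G(x)) should reconstruct x.
    x_rec = F(G(x))
    short = np.abs(x - x_rec).mean()
    # Long cycle: the first reconstruction is cascaded as the input
    # of a second cycle, x_rec -> G(x_rec) -> F(G(x_rec)),
    # penalizing errors that accumulate across reconstructions.
    x_rec2 = F(G(x_rec))
    long_ = np.abs(x - x_rec2).mean()
    return short, long_

# Toy "generators": exactly inverse linear maps stand in for the
# learned networks, so both losses should vanish.
G = lambda x: 2.0 * x + 0.1
F = lambda y: (y - 0.1) / 2.0

x = np.linspace(0.0, 1.0, 8)
s, l = cycle_losses(x, G, F)
```

In training, the two losses would be weighted and added to the adversarial terms; the long cycle only adds a second forward pass through the same generators, which is what makes the constraint easy to implement.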


[Figures 1–8 omitted in this preview]



Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant No. 61703260, and partially supported by the National Natural Science Foundation of China under Grant No. 62173252.

Author information

Correspondence to Gang Wang or Bin Wu.


About this article


Cite this article

Wang, G., Shi, H., Chen, Y. et al. Unsupervised image-to-image translation via long-short cycle-consistent adversarial networks. Appl Intell 53, 17243–17259 (2023). https://doi.org/10.1007/s10489-022-04389-0

