Abstract
Cycle consistency extends generative adversarial networks from aligned image pairs to unpaired training sets and can be applied to a wide range of image-to-image translation tasks. However, errors that accumulate during image reconstruction can degrade the realism and quality of the generated images. To address this, we propose a novel long-short cycle-consistent loss that is simple and easy to implement. Our dual-cycle constrained cross-domain image-to-image translation method suppresses error accumulation and strengthens adversarial learning. When image information is transferred from one domain to another, the cycle-consistency-based reconstruction constraint is enforced over both short and long cycles to eliminate accumulated error. We adopt a cascading scheme with dual-cycle consistency, in which the reconstructed image from the first cycle serves as the input to the next cycle. Extensive experiments on several datasets show a distinct improvement over baseline approaches in most translation scenarios.
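The cascading idea in the abstract can be sketched in a few lines. Below is a minimal, self-contained illustration of a long-short (dual) cycle-consistent loss: the short cycle penalizes the usual one-pass reconstruction, and the long cycle re-feeds that reconstruction through a second cycle so accumulated error is also penalized. The generators `G` and `F`, the function names, and the weights `lam_short`/`lam_long` are all hypothetical stand-ins — in the paper the generators are adversarially trained CNNs, not the toy affine maps used here.

```python
def G(x):
    # toy forward generator X -> Y (hypothetical affine map)
    return [2.0 * v + 1.0 for v in x]

def F(y):
    # toy backward generator Y -> X (near-inverse of G)
    return [(v - 1.0) / 2.0 for v in y]

def l1(a, b):
    # mean absolute error between two same-length vectors
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)

def long_short_cycle_loss(x, lam_short=1.0, lam_long=1.0):
    # Short cycle: x -> G(x) -> F(G(x)) should reconstruct x.
    x_rec1 = F(G(x))
    short = l1(x_rec1, x)
    # Long cycle (cascade): the first reconstruction is fed back as the
    # input of a second cycle, so error accumulated across both passes
    # is also measured against the original x.
    x_rec2 = F(G(x_rec1))
    long_ = l1(x_rec2, x)
    return lam_short * short + lam_long * long_

x = [0.0, 1.0, -2.0]
loss = long_short_cycle_loss(x)
```

With an imperfect `F` the long-cycle term grows faster than the short-cycle term across the two passes, which is the behaior the cascaded constraint is designed to suppress; here `F` exactly inverts `G`, so the loss is zero.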
Acknowledgments
This work was supported by the National Natural Science Foundation of China under Grant No. 61703260, and partially supported by the National Natural Science Foundation of China under Grant No. 62173252.
About this article
Cite this article
Wang, G., Shi, H., Chen, Y. et al. Unsupervised image-to-image translation via long-short cycle-consistent adversarial networks. Appl Intell 53, 17243–17259 (2023). https://doi.org/10.1007/s10489-022-04389-0