
Unsupervised image-to-image translation using intra-domain reconstruction loss

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Generative adversarial networks (GANs) have been used successfully for many computer vision tasks, especially image-to-image translation. However, GANs often suffer from training instability and mode collapse during image-to-image translation, which leads to low-quality generated images. To address this problem, we combine CycleGAN with an intra-domain reconstruction loss (IDRL) and propose an unsupervised image-to-image translation network named "Cycle-IDRL". Specifically, the generator adopts a U-Net architecture with skip connections, which merges coarse-grained and fine-grained features, and the least-squares loss from LSGAN is used to stabilize training. In particular, the target-domain features extracted by the discriminator are fed back into the generator to produce reconstructed samples. We then construct the IDRL between the target-domain samples and the reconstructed samples using the L1 norm. Experimental results on multiple datasets show that the proposed method outperforms existing methods.
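The paper's network code is not reproduced here, but the two loss terms the abstract names (the LSGAN least-squares adversarial loss and the L1-based intra-domain reconstruction loss) can be sketched in a few lines. The function names and the `weight` parameter below are illustrative, not the authors' implementation:

```python
import numpy as np

def lsgan_d_loss(d_real: np.ndarray, d_fake: np.ndarray) -> float:
    # Least-squares discriminator loss (Mao et al., LSGAN):
    # push scores on real samples toward 1 and on fake samples toward 0.
    return 0.5 * float(np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2))

def lsgan_g_loss(d_fake: np.ndarray) -> float:
    # Least-squares generator loss: make fake samples score like real ones.
    return 0.5 * float(np.mean((d_fake - 1.0) ** 2))

def idrl(target: np.ndarray, reconstructed: np.ndarray, weight: float = 1.0) -> float:
    # Intra-domain reconstruction loss: L1 distance between target-domain
    # samples and samples the generator reconstructs from the features
    # that the discriminator extracted for those targets.
    return weight * float(np.mean(np.abs(target - reconstructed)))
```

In training, the generator's total objective would combine `lsgan_g_loss` with the usual cycle-consistency term and `idrl`, where `reconstructed` is produced by feeding the discriminator's target-domain features back through the generator.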



Acknowledgements

The authors are very indebted to the anonymous referees for their critical comments and suggestions for the improvement of this paper. This work was supported by grants from the National Natural Science Foundation of China (Nos. 61673396, U19A2073, 61976245).

Author information


Corresponding author

Correspondence to Mingwen Shao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Fan, Y., Shao, M., Zuo, W. et al. Unsupervised image-to-image translation using intra-domain reconstruction loss. Int. J. Mach. Learn. & Cyber. 11, 2077–2088 (2020). https://doi.org/10.1007/s13042-020-01098-3

