Attention Guided Unsupervised Image-to-Image Translation with Progressively Growing Strategy

  • Conference paper
  • Pattern Recognition (ACPR 2019)

Abstract

Unsupervised image-to-image translation methods such as CycleGAN have received considerable attention in recent research. However, when handling large images, the quality of the generated images is poor. Progressive Growing GAN has shown that growing GANs progressively can generate high-resolution images. However, simply combining the progressive growing method with CycleGAN causes model collapse. In this paper, motivated by skip connections, we propose Progressive Growing CycleGAN (PG-Att-CycleGAN), which stably grows the input size of both the generator and the discriminator from \(256\times 256\) to \(512\times 512\) and finally to \(1024\times 1024\) using the weight \({\alpha }\). The whole process makes the generated images clearer and stabilizes training of the network. In addition, by using the attention block, our new generator and discriminator not only make the domain transfer more natural but also increase the stability of training. As a result, our model can process high-resolution images with good quality. We use the VGG16 network to evaluate domain transfer ability.
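
The abstract does not spell out how the weight \({\alpha }\) enters the network. As a rough illustration only, the sketch below follows the fade-in idea from Progressive Growing of GANs: while a new higher-resolution stage is being introduced, its output is blended with the upsampled output of the previous stage, and \({\alpha }\) ramps from 0 to 1. The function names, the nearest-neighbour upsampling, and the linear schedule are assumptions made for this example, not details taken from the paper.

```python
def upsample_nearest_2x(img):
    """Double the height and width of a 2-D list of floats by pixel repetition."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fade_in(low_res, high_res, alpha):
    """Blend the upsampled low-resolution output with the new high-resolution output.

    alpha = 0 uses only the old (upsampled) path; alpha = 1 uses only the new path.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    up = upsample_nearest_2x(low_res)
    return [
        [(1.0 - alpha) * u + alpha * h for u, h in zip(up_row, hi_row)]
        for up_row, hi_row in zip(up, high_res)
    ]


def alpha_schedule(step, fade_steps):
    """Linear ramp from 0 to 1 over fade_steps steps (hypothetical schedule)."""
    return min(1.0, step / fade_steps)


# Toy stand-ins for one image channel: a 2x2 "old stage" output and a
# 4x4 "new stage" output (in the paper the stages are 256, 512 and 1024 pixels).
low = [[0.0, 1.0], [1.0, 0.0]]
high = [[0.5] * 4 for _ in range(4)]
blended = fade_in(low, high, alpha_schedule(step=250, fade_steps=1000))
```

Blending in this way means the newly added layers influence the output only gradually, which is the usual argument for why progressive growing keeps training stable when the resolution changes.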

Author information

Correspondence to Yuchen Wu, Runtong Zhang or Keiji Yanai.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wu, Y., Zhang, R., Yanai, K. (2020). Attention Guided Unsupervised Image-to-Image Translation with Progressively Growing Strategy. In: Cree, M., Huang, F., Yuan, J., Yan, W. (eds) Pattern Recognition. ACPR 2019. Communications in Computer and Information Science, vol 1180. Springer, Singapore. https://doi.org/10.1007/978-981-15-3651-9_9

  • DOI: https://doi.org/10.1007/978-981-15-3651-9_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-3650-2

  • Online ISBN: 978-981-15-3651-9

  • eBook Packages: Computer Science, Computer Science (R0)
