Attention Guided Unsupervised Image-to-Image Translation with Progressively Growing Strategy

Wu, Yuchen; Zhang, Runtong; Yanai, Keiji

doi:10.1007/978-981-15-3651-9_9

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1180)

Included in the following conference series:

Asian Conference on Pattern Recognition

Abstract

Unsupervised image-to-image translation, exemplified by CycleGAN, has received considerable attention in recent research. However, the generated images are of poor quality when the inputs are large. Progressive Growing GAN has shown that growing a GAN progressively makes it possible to generate high-resolution images. However, simply combining the progressive growing (PG) method with CycleGAN leads to model collapse. In this paper, motivated by skip connections, we propose Progressive Growing CycleGAN (PG-Att-CycleGAN), which stably grows the input size of both the generator and the discriminator from \(256\times 256\) to \(512\times 512\) and finally to \(1024\times 1024\), using the weight \(\alpha\). This process makes the generated images clearer and stabilizes training of the network. In addition, by using an attention block, our new generator and discriminator not only make the domain transfer more natural but also increase the stability of training. Finally, our model can process high-resolution images with good quality. We use a VGG16 network to evaluate the domain-transfer ability.
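The abstract does not spell out how the weight \(\alpha\) is applied. The sketch below assumes the fade-in rule used in Progressive Growing GAN: while a new, higher-resolution stage is being added, its output is linearly blended with the upsampled output of the previous stage, and \(\alpha\) ramps from 0 to 1. The function names (`upsample_nearest_2x`, `fade_in`, `alpha_schedule`), the nearest-neighbor upsampling, and the linear schedule are illustrative assumptions, not the authors' implementation.

```python
def upsample_nearest_2x(image):
    """Double the height and width of a 2D list by repeating each pixel."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # repeat the widened row to double the height
    return out


def fade_in(alpha, low_res_output, high_res_output):
    """Blend two stage outputs: (1 - alpha) * upsample(low) + alpha * high.

    Assumption: this linear blend follows Progressive Growing GAN;
    the abstract only states that a weight alpha is used.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    upsampled = upsample_nearest_2x(low_res_output)
    return [
        [(1.0 - alpha) * u + alpha * h for u, h in zip(up_row, high_row)]
        for up_row, high_row in zip(upsampled, high_res_output)
    ]


def alpha_schedule(step, fade_steps):
    """Hypothetical linear ramp of alpha from 0 to 1 over fade_steps steps."""
    return min(1.0, step / fade_steps)


# Toy example: a 2x2 "old stage" output and a 4x4 "new stage" output.
low = [[1.0, 2.0], [3.0, 4.0]]
high = [[0.0] * 4 for _ in range(4)]
blended = fade_in(alpha_schedule(50, 100), low, high)
```

With \(\alpha = 0\) the blend reproduces the upsampled output of the old stage, and with \(\alpha = 1\) only the new stage contributes, so the network never sees an abrupt change when the resolution doubles.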

Author information

Authors and Affiliations

University of Electronic Science and Technology of China, Chengdu, China
Yuchen Wu & Runtong Zhang
The University of Electro-Communications, Tokyo, Japan
Keiji Yanai

Authors

Yuchen Wu
Runtong Zhang
Keiji Yanai

Corresponding authors

Correspondence to Yuchen Wu, Runtong Zhang or Keiji Yanai.

Editor information

Editors and Affiliations

University of Waikato, Hamilton, New Zealand
Michael Cree
National Ilan University, Yilan, Taiwan
Fay Huang
State University of New York at Buffalo, Buffalo, NY, USA
Junsong Yuan
Auckland University of Technology, Auckland, New Zealand
Wei Qi Yan

About this paper

Cite this paper

Wu, Y., Zhang, R., Yanai, K. (2020). Attention Guided Unsupervised Image-to-Image Translation with Progressively Growing Strategy. In: Cree, M., Huang, F., Yuan, J., Yan, W. (eds) Pattern Recognition. ACPR 2019. Communications in Computer and Information Science, vol 1180. Springer, Singapore. https://doi.org/10.1007/978-981-15-3651-9_9

DOI: https://doi.org/10.1007/978-981-15-3651-9_9
Published: 07 March 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-3650-2
Online ISBN: 978-981-15-3651-9
eBook Packages: Computer Science (R0)
