
Feature-attention module for context-aware image-to-image translation


Abstract

In summer2winter image-to-image translation, trees should be transformed from green to gray, but the colors of houses or girls should not be changed. However, current unsupervised one-to-one image translation techniques fail to focus the translation on individual objects. To tackle this issue, we propose a novel feature-attention (FA) module that captures the mutual influences of various features, so as to automatically attend only to specific scene objects in unsupervised image-to-image translation. The proposed module can be integrated into different image translation networks and improves their context-aware translation ability. Qualitative and quantitative experiments on the horse2zebra, apple2orange and summer2winter datasets, based on DualGAN, CycleGAN and UNIT, demonstrate that our proposed module yields a significant improvement over state-of-the-art methods. In addition, experiments on the apple2orange dataset based on MUNIT and DRIT further indicate the effectiveness of the FA module in multimodal translation tasks. We also show that the computational complexity of the proposed module is linear in the image size; moreover, experiments on the day2night dataset show that the proposed module is insensitive to increases in image resolution. The source code and trained models are available at https://github.com/gaoyuainshuyi/fa.
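The FA module's exact architecture is defined in the full text and the linked repository. As a purely illustrative sketch of how attention over feature channels, rather than over pixel pairs, keeps the cost linear in the number of pixels, consider the following PyTorch-style block. All names here are hypothetical and this is not the authors' implementation: a standard self-attention map over N = H×W pixels costs O(N²), whereas a C×C channel-affinity matrix costs O(C²N), i.e., linear in image size.

```python
# Minimal, hypothetical sketch (not the authors' implementation) of a
# feature-attention-style block. Mutual influences between feature maps
# are modeled with a C x C channel-affinity matrix, whose O(C^2 * N) cost
# is linear in the number of pixels N = H * W, unlike the O(N^2) spatial
# affinity of standard self-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAttention(nn.Module):
    def __init__(self):
        super().__init__()
        # Learned scale, initialized to zero so the block starts as an
        # identity mapping and can be dropped into an existing generator
        # (e.g., a CycleGAN or UNIT generator) without disrupting it.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        feats = x.view(b, c, h * w)                          # B x C x N
        affinity = torch.bmm(feats, feats.transpose(1, 2))   # B x C x C
        attn = F.softmax(affinity, dim=-1)
        out = torch.bmm(attn, feats).view(b, c, h, w)        # reweighted features
        return self.gamma * out + x                          # residual connection

# Example: apply to an intermediate 256-channel feature map.
# y = FeatureAttention()(torch.randn(1, 256, 64, 64))
```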




References

  1. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: ECCV (2014)

  2. Mikaeli, E., Aghagolzadeh, A., Azghani, M.: Single-image super-resolution via patch-based and group-based local smoothness modeling. Vis. Comput. 36, 1573–1589 (2020)


  3. Ma, T., Tian, W.: Back-projection-based progressive growing generative adversarial network for single image super-resolution. Vis. Comput. (2020). https://doi.org/10.1007/s00371-020-01843-3


  4. Zhang, R., Isola, P., Efros, A.A.: Colorful image colorization. In: ECCV (2016)

  5. Pathak, D., Krähenbühl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: feature learning by inpainting. In: CVPR (2016)

  6. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: CVPR (2016)

  7. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR (2017)

  8. Mo, S., Cho, M., Shin, J.: InstaGAN: instance-aware image-to-image translation. In: ICLR (2019)

  9. Chen, Q., Koltun, V.: Photographic image synthesis with cascaded refinement networks. In: ICCV (2017)

  10. Yi, Z., Zhang, H., Tan, P., Gong, M.: DualGAN: unsupervised dual learning for image-to-image translation. In: ICCV (2017)

  11. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV (2017)

  12. Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: NIPS (2017)

  13. Mejjati, Y.A., Richardt, C., Tompkin, J., Cosker, D., Kim, K.I.: Unsupervised attention-guided image to image translation. In: NeurIPS (2018)

  14. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: ICCV (2017)

  15. Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS (2014)

  16. Mirza, M., Osindero, S.: Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)


  17. Liu, M.Y., Tuzel, O.: Coupled generative adversarial networks. In: NIPS (2016)

  18. Amodio, M., Krishnaswamy, S.: TraVeLGAN: image-to-image translation by transformation vector learning. In: CVPR (2019)

  19. Wu, W., Cao, K., Li, C., Qian, C., Loy, C.C.: TransGaGa: geometry-aware unsupervised image-to-image translation. In: CVPR (2019)

  20. Shen, Z., Huang, M., Shi, J., Xue, X., Huang, T.: Towards instance-level image-to-image translation. In: CVPR (2019)

  21. Rensink, R.A.: The dynamic representation of scenes. Vis. Cogn. 7(1–3), 17–42 (2000)


  22. Hu, J., Shen, L., Sun, G., Wu, E.: Squeeze-and-excitation networks. In: CVPR (2018)

  23. Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: CBAM: convolutional block attention module. In: ECCV (2018)

  24. Wang, F., Jiang, M., Qian, C., Yang, S., Li, C., Zhang, H., Wang, X., Tang, X.: Residual attention network for image classification. In: CVPR (2017)

  25. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: NIPS (2017)

  26. Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adversarial networks. In: ICML (2019)

  27. Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: CVPR (2018)

  28. Chen, X., Xu, C., Yang, X., Tao, D.: Attention-GAN for object transfiguration in wild images. In: ECCV (2018)

  29. Gao, Z., Xie, J., Wang, Q., Li, P.: Global second-order pooling convolutional networks. In: CVPR (2019)

  30. Wang, Z., Li, J., Song, G., Li, T.: Less memory, faster speed: refining self-attention module for image reconstruction. arXiv preprint arXiv:1905.08008 (2019)

  31. Bińkowski, M., Sutherland, D.J., Arbel, M., Gretton, A.: Demystifying MMD GANs. In: ICLR (2018)

  32. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: NIPS (2017)

  33. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: MICCAI (2015)

  34. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)

  35. Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: ECCV (2014)

  36. https://www.wjx.cn/

  37. Huang, X., Liu, M.Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-to-image translation. In: ECCV (2018)

  38. Lee, H.Y., Tseng, H.Y., Huang, J.B., Singh, M., Yang, M.H.: Diverse image-to-image translation via disentangled representations. In: ECCV (2018)


Acknowledgements

We thank the reviewers for their valuable feedback. This work was partly supported by the National Natural Science Foundation of China (61762003), the CAS “Light of West China” Program (2018QNXZ0024) and the first-class discipline construction project of Ningxia universities (Electronic Science and Technology: NXYLXK2017A07).

Author information


Corresponding author

Correspondence to Jing Bai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest related to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Bai, J., Chen, R. & Liu, M. Feature-attention module for context-aware image-to-image translation. Vis Comput 36, 2145–2159 (2020). https://doi.org/10.1007/s00371-020-01943-0

