
Feature-attention module for context-aware image-to-image translation


Abstract

In summer2winter image-to-image translation, trees should be transformed from green to gray, but the colors of houses or girls should not be changed. However, current unsupervised one-to-one image translation techniques fail to focus the translation on individual objects. To tackle this issue, we propose a novel feature-attention (FA) module that captures the mutual influences of various features, so as to automatically attend only to specific scene objects in unsupervised image-to-image translation. The proposed module can be integrated into different image translation networks and improves their context-aware translation ability. Qualitative and quantitative experiments on the horse2zebra, apple2orange and summer2winter datasets, based on DualGAN, CycleGAN and UNIT, demonstrate that our proposed module yields a significant improvement over state-of-the-art methods. In addition, experiments on the apple2orange dataset based on MUNIT and DRIT further indicate the effectiveness of the FA module in multimodal translation tasks. We also show that the computational complexity of the proposed module is linear in the image size; moreover, experiments on the day2night dataset show that the proposed module is insensitive to increases in image resolution. The source code and trained models are available at https://github.com/gaoyuainshuyi/fa.
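The FA module's exact architecture is defined in the full text and the linked repository. As a purely illustrative sketch of how attention over feature channels, rather than over pixel pairs, keeps the cost linear in the number of pixels, consider the following PyTorch-style block. All names here are hypothetical and this is not the authors' implementation: a standard self-attention map over N = H×W pixels costs O(N²), whereas a C×C channel-affinity matrix costs O(C²N), i.e., linear in image size.

```python
# Minimal, hypothetical sketch (not the authors' implementation) of a
# feature-attention-style block. Mutual influences between feature maps
# are modeled with a C x C channel-affinity matrix, whose O(C^2 * N) cost
# is linear in the number of pixels N = H * W, unlike the O(N^2) spatial
# affinity of standard self-attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAttention(nn.Module):
    def __init__(self):
        super().__init__()
        # Learned scale, initialized to zero so the block starts as an
        # identity mapping and can be dropped into an existing generator
        # (e.g., a CycleGAN or UNIT generator) without disrupting it.
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        feats = x.view(b, c, h * w)                          # B x C x N
        affinity = torch.bmm(feats, feats.transpose(1, 2))   # B x C x C
        attn = F.softmax(affinity, dim=-1)
        out = torch.bmm(attn, feats).view(b, c, h, w)        # reweighted features
        return self.gamma * out + x                          # residual connection

# Example: apply to an intermediate 256-channel feature map.
# y = FeatureAttention()(torch.randn(1, 256, 64, 64))
```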




References

  1. Dong, C., Loy, C.C., He, K., Tang, X.: Learning a deep convolutional network for image super-resolution. In: ECCV (2014)

  2. Mikaeli, E., Aghagolzadeh, A., Azghani, M.: Single-image super-resolution via patch-based and group-based local smoothness modeling. Vis. Comput. 36, 1573–1589 (2020)


  3. Ma, T., Tian, W.: Back-projection-based progressive growing generative adversarial network for single image super-resolution. Vis. Comput. (2020). https://doi.org/10.1007/s00371-020-01843-3


  4. Zhang, R., Isola, P., Efros, A.A.: Colorful image colorization. In: ECCV (2016)

  5. Pathak, D., Krähenbühl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: feature learning by inpainting. In: CVPR (2016)

  6. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: CVPR (2016)

  7. Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: CVPR (2017)

  8. Mo, S., Cho, M., Shin, J.: InstaGAN: instance-aware image-to-image translation. In: ICLR (2019)

  9. Chen, Q., Koltun, V.: Photographic image synthesis with cascaded refinement networks. In: ICCV (2017)

  10. Yi, Z., Zhang, H., Tan, P., Gong, M.: DualGAN: unsupervised dual learning for image-to-image translation. In: ICCV (2017)

  11. Zhu, J.Y., Park, T., Isola, P., Efros, A.A.: Unpaired image-to-image translation using cycle-consistent adversarial networks. In: ICCV (2017)

  12. Liu, M.Y., Breuel, T., Kautz, J.: Unsupervised image-to-image translation networks. In: NIPS (2017)

  13. Mejjati, Y.A., Richardt, C., Tompkin, J., Cosker, D., Kim, K.I.: Unsupervised attention-guided image to image translation. In: NeurIPS (2018)

  14. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: visual explanations from deep networks via gradient-based localization. In: ICCV (2017)

  15. Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: NIPS (2014)

  16. Mirza, M., Osindero, S.: Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)


  17. Liu, M.Y., Tuzel, O.: Coupled generative adversarial networks. In: NIPS (2016)

  18. Amodio, M., Krishnaswamy, S.: TraVeLGAN: image-to-image translation by transformation vector learning. In: CVPR (2019)

  19. Wu, W., Cao, K., Li, C., Qian, C., Loy, C.C.: TransGaGa: geometry-aware unsupervised image-to-image translation. In: CVPR (2019)

  20. Shen, Z., Huang, M., Shi, J., Xue, X., Huang, T.: Towards instance-level image-to-image translation. In: CVPR (2019)

  21. Rensink, R.A.: The dynamic representation of scenes. Vis. Cogn. 7(1–3), 17–42 (2000)


  22. Hu, J., Shen, L., Sun, G., Wu, E.: Squeeze-and-excitation networks. In: CVPR (2018)

  23. Woo, S., Park, J., Lee, J.Y., Kweon, I.S.: CBAM: convolutional block attention module. In: ECCV (2018)

  24. Wang, F., Jiang, M., Qian, C., Yang, S., Li, C., Zhang, H., Wang, X., Tang, X.: Residual attention network for image classification. In: CVPR (2017)

  25. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. In: NIPS (2017)

  26. Zhang, H., Goodfellow, I., Metaxas, D., Odena, A.: Self-attention generative adversarial networks. In: ICML (2019)

  27. Wang, X., Girshick, R., Gupta, A., He, K.: Non-local neural networks. In: CVPR (2018)

  28. Chen, X., Xu, C., Yang, X., Tao, D.: Attention-GAN for object transfiguration in wild images. In: ECCV (2018)

  29. Gao, Z., Xie, J., Wang, Q., Li, P.: Global second-order pooling convolutional networks. In: CVPR (2019)

  30. Wang, Z., Li, J., Song, G., Li, T.: Less memory, faster speed: refining self-attention module for image reconstruction. arXiv preprint arXiv:1905.08008 (2019)

  31. Bińkowski, M., Sutherland, D.J., Arbel, M., Gretton, A.: Demystifying MMD GANs. In: ICLR (2018)

  32. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: NIPS (2017)

  33. Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: MICCAI (2015)

  34. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)

  35. Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: ECCV (2014)

  36. https://www.wjx.cn/

  37. Huang, X., Liu, M.Y., Belongie, S., Kautz, J.: Multimodal unsupervised image-to-image translation. In: ECCV (2018)

  38. Lee, H.Y., Tseng, H.Y., Huang, J.B., Singh, M., Yang, M.H.: Diverse image-to-image translation via disentangled representations. In: ECCV (2018)


Acknowledgements

We thank the reviewers for their valuable feedback. This work was partly supported by the National Natural Science Foundation of China (61762003), the CAS “Light of West China” Program (2018QNXZ0024) and the first-class discipline construction project of Ningxia universities (Electronic Science and Technology: NXYLXK2017A07).

Author information


Corresponding author

Correspondence to Jing Bai.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest related to this work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Bai, J., Chen, R. & Liu, M. Feature-attention module for context-aware image-to-image translation. Vis Comput 36, 2145–2159 (2020). https://doi.org/10.1007/s00371-020-01943-0

