
Improving Disentanglement-Based Image-to-Image Translation with Feature Joint Block Fusion

  • Conference paper
Advances in Brain Inspired Cognitive Systems (BICS 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11691)


Abstract

Image-to-image translation aims to change attributes or domains of images; feature-disentanglement-based methods have recently become widely used for this task owing to their feasibility and effectiveness. In such methods, a feature extractor is typically integrated into an encoder-decoder generative adversarial network (GAN), extracting domain features and image features separately. However, the two types of features are not properly combined, resulting in blurry generated images and indistinguishable translated domains. To alleviate this issue, we propose a new feature fusion approach that better exploits feature disentanglement. Instead of adding the two extracted features directly, we design a joint block fusion comprising integration, concatenation, and squeeze operations, allowing the generator to take full advantage of both features and generate more photo-realistic images. We evaluate both the classification accuracy and the Fréchet Inception Distance (FID) of the proposed method on two benchmark datasets, Alps Seasons and CelebA. Extensive experimental results demonstrate that the proposed joint block fusion improves both the discriminability of domains and the quality of the translated images. Specifically, the classification accuracies are improved by 1.04% (FID reduced by 1.22) on Alps Seasons and by 1.87% (FID reduced by 4.96) on CelebA.
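To make the fusion idea concrete, the NumPy sketch below contrasts the two strategies the abstract mentions: rather than adding the domain feature to the image feature element-wise, it concatenates them along the channel axis and then "squeezes" the result back to the original channel count with a 1x1-convolution-style channel mix. All shapes, the tiling of the domain code, and the squeeze matrix are illustrative assumptions; the paper's exact block design is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: batch of 2, an 8-channel image feature map on an
# 8x8 grid, and a 4-dimensional domain code.
B, C_img, C_dom, H, W = 2, 8, 4, 8, 8
img_feat = rng.standard_normal((B, C_img, H, W))
dom_code = rng.standard_normal((B, C_dom))

# Integration: tile the domain code across all spatial positions so it
# has the same spatial extent as the image feature map.
dom_feat = np.broadcast_to(dom_code[:, :, None, None], (B, C_dom, H, W))

# Concatenation along the channel axis, instead of element-wise addition.
fused = np.concatenate([img_feat, dom_feat], axis=1)  # (B, C_img + C_dom, H, W)

# Squeeze: a 1x1 convolution (here just a channel-mixing matrix) maps the
# concatenated channels back down to C_img for the decoder.
W_squeeze = rng.standard_normal((C_img, C_img + C_dom)) * 0.1
out = np.einsum('oc,bchw->bohw', W_squeeze, fused)  # (B, C_img, H, W)

print(fused.shape, out.shape)
```

The point of the squeeze step is that the decoder's input width stays unchanged, so concatenation can replace addition without altering the rest of the architecture.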



Acknowledgements

The work was partially supported by the National Natural Science Foundation of China under nos. 61876155 and 61876154; the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under no. 17KJD520010; the Suzhou Science and Technology Program under nos. SYG201712 and SZS201613; the Natural Science Foundation of Jiangsu Province under nos. BK20181189 and BK20181190; the Key Program Special Fund of XJTLU under nos. KSF-A-01, KSF-P-02, KSF-E-26, and KSF-A-10; and the XJTLU Research Development Fund under no. RDF-16-02-49.

Author information


Corresponding author

Correspondence to Kaizhu Huang.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, Z., Zhang, R., Wang, Q.F., Huang, K. (2020). Improving Disentanglement-Based Image-to-Image Translation with Feature Joint Block Fusion. In: Ren, J., et al. Advances in Brain Inspired Cognitive Systems. BICS 2019. Lecture Notes in Computer Science, vol 11691. Springer, Cham. https://doi.org/10.1007/978-3-030-39431-8_52

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-39431-8_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-39430-1

  • Online ISBN: 978-3-030-39431-8

  • eBook Packages: Computer Science (R0)
