
A bi-directional facial attribute transfer framework: transfer your single facial attribute to a portrait illustration

  • Original Article
  • Neural Computing and Applications

Abstract

Facial attribute transfer aims to transfer target facial attributes (such as beard, bangs and opening mouth) to a source facial image that lacks them, while keeping the non-target attributes of the face intact. Existing methods for facial attribute transfer are essentially oriented toward homogeneous images, focusing on transferring target attributes to (or between) photorealistic facial images. In this paper, we address facial attribute transfer between heterogeneous images, which is a new and more challenging task. More specifically, we propose a bi-directional facial attribute transfer method that uses GANs (generative adversarial networks) and latent representations in a new way for instance-based facial attribute transfer: a target facial attribute, together with its basic shape, is transferred from a reference photorealistic facial image to a source realistic portrait illustration, and vice versa (i.e., the target attribute is erased from the facial image). The key challenges of this task are achieving visual style consistency of the transferred attribute in the heterogeneous result images and overcoming the information dimensionality imbalance between photorealistic facial images and realistic portrait illustrations. Unlike previous latent-representation-based facial attribute transfer methods, which mix content and visual style in a single latent representation, we handle the content and the visual style of an image separately in latent representation learning, using a composite encoder built from a convolutional neural network and a fully connected neural network. This design turns out to preserve visual style consistency well. In addition, we introduce different multipliers for the weights of the loss terms in our objective functions to balance the information imbalance between heterogeneous images. Experiments show that our method achieves facial attribute transfer between heterogeneous images with good results. For quantitative analysis, FID scores of our method on a couple of datasets are also reported to demonstrate its effectiveness.
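To make the composite-encoder idea concrete, the following is a minimal PyTorch sketch of one plausible realization (not the authors' released code): a convolutional content branch whose flat output is split into a target-attribute code \(a\) and a non-target code \(z\), and a fully connected style branch producing a style code \(s\). The class name CompositeEncoder, all layer sizes, the \(64\times 64\) input resolution and the code dimensions are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class CompositeEncoder(nn.Module):
    """Encodes an image into a content code c = (a, z) and a style code s.

    A sketch of the composite-encoder idea; architecture details are assumed.
    """

    def __init__(self, attr_dim=8, nontarget_dim=56, style_dim=8):
        super().__init__()
        self.attr_dim = attr_dim
        # Content branch: CNN mapping a 3x64x64 image to a flat content code.
        self.content_cnn = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(), # 16 -> 8
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, attr_dim + nontarget_dim),
        )
        # Style branch: fully connected network producing the style code.
        self.style_fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 64 * 64, 256), nn.ReLU(),
            nn.Linear(256, style_dim),
        )

    def forward(self, img):
        c = self.content_cnn(img)                         # content code c = (a, z)
        a, z = c[:, :self.attr_dim], c[:, self.attr_dim:]
        s = self.style_fc(img)                            # style code s
        return a, z, s

# Encode a photo x^1 (with the attribute) and an illustration y^0 (without it),
# then swap the target-attribute codes; the paper's generators (not sketched
# here) would decode (a_y, z_x, s_x) into x_trans^0 and (a_x, z_y, s_y) into
# y_trans^1.
E_X, E_Y = CompositeEncoder(), CompositeEncoder()
a_x, z_x, s_x = E_X(torch.randn(1, 3, 64, 64))  # codes of x^1
a_y, z_y, s_y = E_Y(torch.randn(1, 3, 64, 64))  # codes of y^0
```

Keeping the style code in a separate branch is what lets each result image retain the visual style of its own domain while only the attribute code crosses domains; the full notation is formalized in the list that follows.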


Abbreviations

\(*\) :

The state with or without the target attribute. \(*\in \left\{ 0,1\right\}\), where 0 denotes the state without the target attribute and 1 denotes the state with the target attribute

\({\Theta }^*\) :

A sample with or without the target attribute

\(a^{{\Theta }^*}\) :

The target attribute code of \({\Theta }^*\)

\(z^{{\Theta }^*}\) :

The non-target attribute code of \({\Theta }^*\)

\(c^{{\Theta }^*}\) :

The content code of \({\Theta }^*\), where \(c^{{\Theta }^*} = (a^{{\Theta }^*},z^{{\Theta }^*} )\)

\(s^{{\Theta }^*}\) :

The style code of \({\Theta }^*\)

\(D_{a^{{\Theta }^*}}\) :

The dimensionality of \(a^{{\Theta }^*}\)

\(x^0\) :

A photorealistic facial image without the target attribute

\(x^1\) :

A photorealistic facial image with the target attribute

\(y^0\) :

A realistic portrait illustration without the target attribute

\(y^1\) :

A realistic portrait illustration with the target attribute

\(x_{\mathrm{trans}}^0\) :

The result of editing the photorealistic facial image, i.e., \(x^1\) with the target attribute erased

\(y_{\mathrm{trans}}^1\) :

The result of editing the realistic portrait illustration, i.e., \(y^0\) with the target attribute transferred to it

\(X^0\) :

The domain of photorealistic facial images without the target attribute

\(X^1\) :

The domain of photorealistic facial images with the target attribute

\(Y^0\) :

The domain of realistic portrait illustrations without the target attribute

\(Y^1\) :

The domain of realistic portrait illustrations with the target attribute

\(E_{X^*}^C\) :

The content encoder for \(x^*\)

\(E_{X^*}^S\) :

The style encoder for \(x^*\)

\(E_{X^*}\) :

The composite encoder for \(x^*\), where \(E_{X^*}=(E_{X^*}^C,E_{X^*}^S)\)

\(E_{Y^*}^C\) :

The content encoder for \(y^*\)

\(E_{Y^*}^S\) :

The style encoder for \(y^*\)

\(E_{Y^*}\) :

The composite encoder for \(y^*\), where \(E_{Y^*}=(E_{Y^*}^C,E_{Y^*}^S)\)

\(G_{X^*}\) :

The generator for \(x^*\)

\(G_{Y^*}\) :

The generator for \(y^*\)

\(D_{X^0}\) :

The discriminator that distinguishes real \(x^0\) images from generated \(x_{\mathrm{trans}}^0\) images

\(D_{Y^1}\) :

The discriminator that distinguishes real \(y^1\) images from generated \(y_{\mathrm{trans}}^1\) images

L :

The loss function

\(\lambda\) :

The weight of a loss term in the objective functions

\(\alpha\) :

The multiplier for \(\lambda\)
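
Read together, the notation above suggests the following transfer pipeline; the compositions below are reconstructed from these definitions for illustration and are not formulas quoted from the paper:

\(x_{\mathrm{trans}}^0 = G_{X^0}\left( \left( a^{y^0}, z^{x^1} \right), s^{x^1} \right)\) and \(y_{\mathrm{trans}}^1 = G_{Y^1}\left( \left( a^{x^1}, z^{y^0} \right), s^{y^0} \right)\),

i.e., each result keeps its own non-target code and style code and takes the target-attribute code from the other domain's sample. Under the same reading, a plausible form of the weighted objective, with each multiplier \(\alpha _i\) scaling its weight \(\lambda _i\) to balance the information imbalance between the heterogeneous domains, is \(L_{\mathrm{total}} = \sum _i \alpha _i \lambda _i L_i\).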


Funding

This work was supported by the National Natural Science Foundation of China [No. 61672158] and the Natural Science Foundation of Fujian Province [No. 2018J01798].

Author information


Corresponding author

Correspondence to Dong-yi Ye.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Transfer results concerning beard, bangs and opening mouth

See Figs. 7, 8, 9, 10, 11 and 12.

Fig. 7

Transfer results concerning beard from an \(x^1\) to a \(y^0\). Images in every row, from left to right, are \(x^1\), \(y^0\), \(x_{\mathrm{trans}}^0\) and \(y_{\mathrm{trans}}^1\), respectively

Fig. 8

Transfer results concerning beard from different \(x^1\) images to the same \(y^0\). Every two columns, from left to right, form a group. In each group, the left column contains the \(x^1\) images, the top of the right column is the \(y^0\) image, and the rest are the corresponding \(y_{\mathrm{trans}}^1\) images

Fig. 9

Transfer results concerning bangs from an \(x^1\) to a \(y^0\). Images in every row, from left to right, are \(x^1\), \(y^0\), \(x_{\mathrm{trans}}^0\) and \(y_{\mathrm{trans}}^1\), respectively

Fig. 10

Transfer results concerning bangs from different \(x^1\) images to the same \(y^0\). Every two columns, from left to right, form a group. In each group, the left column contains the \(x^1\) images, the top of the right column is the \(y^0\) image, and the rest are the corresponding \(y_{\mathrm{trans}}^1\) images

Fig. 11

Transfer results concerning opening mouth from an \(x^1\) to a \(y^0\). Images in every row, from left to right, are \(x^1\), \(y^0\), \(x_{\mathrm{trans}}^0\) and \(y_{\mathrm{trans}}^1\), respectively

Fig. 12

Transfer results concerning opening mouth from different \(x^1\) images to the same \(y^0\). Every two columns, from left to right, form a group. In each group, the left column contains the \(x^1\) images, the top of the right column is the \(y^0\) image, and the rest are the corresponding \(y_{\mathrm{trans}}^1\) images


About this article


Cite this article

Shi, Rx., Ye, Dy. & Chen, Zj. A bi-directional facial attribute transfer framework: transfer your single facial attribute to a portrait illustration. Neural Comput & Applic 34, 253–270 (2022). https://doi.org/10.1007/s00521-021-06360-5

