
Multi-view face generation via unpaired images

Original article · The Visual Computer

Abstract

Multi-view face generation from a single image is an essential and challenging problem. Most existing methods require paired images for training; however, collecting and labeling large-scale paired face images is labor-intensive and time-consuming. To address this problem, this paper proposes multi-view face generation via unpaired images. To avoid paired data, an encoder and a discriminator are trained so that the encoder learns high-level abstract features of the identity and view of the input image; these low-dimensional representations are then fed to the generator, and the generator and discriminator are trained so that realistic face images can be reconstructed. During testing, multiple one-hot vectors representing views are imposed on the identity representation, and the generator maps each combination back to high-dimensional data, generating multi-view images while preserving identity features. Furthermore, to reduce the number of labels used, semi-supervised learning is incorporated into the model. Experimental results show that our method produces photo-realistic multi-view face images with a small number of view labels, providing a useful exploration of face image synthesis from unpaired data and very few labels.
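To make the test-time procedure above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the toy Encoder and Generator architectures, the identity-code size ID_DIM, and the view count N_VIEWS are all illustrative assumptions. It only demonstrates the mechanism the abstract describes: one identity code extracted from a single image is paired with several one-hot view vectors and decoded into one face image per view.

```python
# Minimal sketch (hypothetical shapes and networks, not the paper's code):
# pair one identity code with several one-hot view vectors at test time.
import torch
import torch.nn as nn

ID_DIM, N_VIEWS = 128, 9  # assumed sizes, for illustration only

class Encoder(nn.Module):
    """Toy stand-in: maps a 3x64x64 face image to an identity code."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, ID_DIM),
        )

    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):
    """Toy stand-in: maps [identity code, one-hot view] back to an image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ID_DIM + N_VIEWS, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

encoder, generator = Encoder().eval(), Generator().eval()

with torch.no_grad():
    image = torch.randn(1, 3, 64, 64)        # a single input face
    identity = encoder(image)                # view-free identity code
    views = torch.eye(N_VIEWS)               # one one-hot vector per view
    # Repeat the same identity code and concatenate a different view label
    # onto each copy, then decode every combination into a face image.
    z = torch.cat([identity.expand(N_VIEWS, -1), views], dim=1)
    multi_view_faces = generator(z)          # shape (N_VIEWS, 3, 64, 64)
```

Feeding the same identity code with different view vectors is what lets the generator change pose without changing identity; during training, the discriminator is what pushes the decoded images toward realism.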



Notes

  1. https://github.com/guozhongluo/head-pose-estimation-and-face-landmark


Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 62076117, 61762061, and 61862044, by the Natural Science Foundation of Jiangxi Province, China, under Grant 20161ACB20004, and by the Jiangxi Key Laboratory of Smart City under Grant 20192BCD40002.

Author information


Corresponding author

Correspondence to Weidong Min.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, S., Zou, Y., Min, W. et al. Multi-view face generation via unpaired images. Vis Comput 38, 2539–2554 (2022). https://doi.org/10.1007/s00371-021-02129-y

