
Multi-view face generation via unpaired images

Original article · The Visual Computer

Abstract

Multi-view face generation from a single image is an essential and challenging problem. Most existing methods require paired images for training; however, collecting and labeling large-scale paired face images is labor-intensive and time-consuming. To address this problem, this paper proposes multi-view face generation via unpaired images. To avoid paired data, an encoder and a discriminator are trained so that the encoder learns high-level abstract features of the identity and view of the input image; these low-dimensional representations are then fed to the generator, and the generator and discriminator are trained so that realistic face images can be reconstructed. During testing, multiple one-hot vectors representing views are imposed on the identity representation, and the generator maps each combination back to high-dimensional data, generating multi-view images while preserving identity features. Furthermore, to reduce the number of labels used, semi-supervised learning is incorporated into the model. Experimental results show that our method produces photo-realistic multi-view face images with a small number of view labels, providing a useful exploration of face image synthesis from unpaired data and very few labels.
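To make the test-time procedure above concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the toy Encoder and Generator architectures, the identity-code size ID_DIM, and the view count N_VIEWS are all illustrative assumptions. It only demonstrates the mechanism the abstract describes: one identity code extracted from a single image is paired with several one-hot view vectors and decoded into one face image per view.

```python
# Minimal sketch (hypothetical shapes and networks, not the paper's code):
# pair one identity code with several one-hot view vectors at test time.
import torch
import torch.nn as nn

ID_DIM, N_VIEWS = 128, 9  # assumed sizes, for illustration only

class Encoder(nn.Module):
    """Toy stand-in: maps a 3x64x64 face image to an identity code."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, ID_DIM),
        )

    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):
    """Toy stand-in: maps [identity code, one-hot view] back to an image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ID_DIM + N_VIEWS, 64 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

encoder, generator = Encoder().eval(), Generator().eval()

with torch.no_grad():
    image = torch.randn(1, 3, 64, 64)        # a single input face
    identity = encoder(image)                # view-free identity code
    views = torch.eye(N_VIEWS)               # one one-hot vector per view
    # Repeat the same identity code and concatenate a different view label
    # onto each copy, then decode every combination into a face image.
    z = torch.cat([identity.expand(N_VIEWS, -1), views], dim=1)
    multi_view_faces = generator(z)          # shape (N_VIEWS, 3, 64, 64)
```

Feeding the same identity code with different view vectors is what lets the generator change pose without changing identity; during training, the discriminator is what pushes the decoded images toward realism.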



Notes

  1. https://github.com/guozhongluo/head-pose-estimation-and-face-landmark


Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 62076117, 61762061, and 61862044, by the Natural Science Foundation of Jiangxi Province, China, under Grant 20161ACB20004, and by the Jiangxi Key Laboratory of Smart City under Grant 20192BCD40002.

Author information


Corresponding author

Correspondence to Weidong Min.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Wang, S., Zou, Y., Min, W. et al. Multi-view face generation via unpaired images. Vis Comput 38, 2539–2554 (2022). https://doi.org/10.1007/s00371-021-02129-y

