Learning-detailed 3D face reconstruction based on convolutional neural networks from a single image

Khan, Asad; Hayat, Sakander; Ahmad, Muhammad; Cao, Jinde; Tahir, Muhammad Faizan; Ullah, Asad; Javed, Muhammad Sufyan

doi:10.1007/s00521-020-05373-w

Original Article
Published: 30 September 2020

Volume 33, pages 5951–5964, (2021)

Neural Computing and Applications

Asad Khan¹,
Sakander Hayat² (ORCID: orcid.org/0000-0002-6842-7604),
Muhammad Ahmad³,
Jinde Cao⁴,
Muhammad Faizan Tahir⁵,
Asad Ullah⁶ &
Muhammad Sufyan Javed⁷,⁸

Abstract

Convolutional neural networks (CNNs) have made it possible to reconstruct detailed 3D face geometry from a single input image. The success of CNN-based techniques depends heavily on large-scale labelled data. However, no publicly available dataset provides a large quantity of face images annotated with the corresponding 3D face geometry. State-of-the-art learning-based 3D face reconstruction methods therefore synthesize their training data from a coarse morphable face model, which yields non-photo-realistic face images. In this article, we propose a novel data-generation technique, based on learning-based inverse face rendering, that renders a large number of photo-realistic face images with distinct properties. Using well-constructed datasets built on real-time fine-scale textured 3D face reconstruction, we train two cascaded CNNs in a coarse-to-fine manner for detailed 3D face reconstruction from a single image. Experimental results demonstrate that our method efficiently reconstructs 3D face shapes with geometric details from only one input image. Furthermore, the results demonstrate that our technique remains effective under variations in pose, expression and lighting.

Notes

Both fine-scale and coarse-scale photo-realistic face image datasets will be publicly available once the present work is published.
https://github.com/unibas-gravis/scalismo-faces
https://github.com/waps101/3DMMedges

Acknowledgements

The authors would like to thank the anonymous reviewers for their careful reading of this article and for their comments, which led to a number of improvements. S. Hayat and A. Ullah are supported by the Higher Education Commission, Pakistan, under grant number 20-11682/NRPU/RGM/R&D/HEC/2020. A. Khan was supported by the National Natural Science Foundation of China (Grant No. 61772164).

Author information

Asad Khan and Sakander Hayat contributed equally to this work.

Authors and Affiliations

School of Computer Science and Software Engineering, Guangzhou University, Guangzhou, 510006, People’s Republic of China
Asad Khan
Faculty of Engineering Sciences, GIK Institute of Engineering Sciences and Technology, Topi, Pakistan
Sakander Hayat
Department of Computer Science, National University of Computer and Emerging Sciences (NUCES-FAST), Faisalabad Campus, Pakistan
Muhammad Ahmad
Research Center for Complex Systems and Network Sciences, and School of Mathematics, Southeast University, Nanjing, China
Jinde Cao
School of Electric Power, South China University of Technology, Guangzhou, 510640, China
Muhammad Faizan Tahir
Department of Mathematical Sciences, Karakoram International University (KIU), Gilgit-Baltistan, Pakistan
Asad Ullah
Department of Physics, COMSATS University Islamabad, Lahore Campus, Punjab, 54000, Pakistan
Muhammad Sufyan Javed
Siyuan Laboratory, Department of Physics, Jinan University, Guangzhou, 510632, People’s Republic of China
Muhammad Sufyan Javed

Corresponding authors

Correspondence to Asad Khan or Sakander Hayat.

Ethics declarations

Conflict of Interests

The authors declare that there is no conflict of interest regarding the publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Khan, A., Hayat, S., Ahmad, M. et al. Learning-detailed 3D face reconstruction based on convolutional neural networks from a single image. Neural Comput & Applic 33, 5951–5964 (2021). https://doi.org/10.1007/s00521-020-05373-w

Received: 21 August 2019
Accepted: 18 September 2020
Published: 30 September 2020
Issue Date: June 2021
DOI: https://doi.org/10.1007/s00521-020-05373-w
