Abstract
3D face reconstruction from a single face image has received much attention in the past decade, as it is widely used in many computer vision applications. Although 3D scanners and several commercial systems offer more accurate solutions, they have drawbacks such as the need for manual initialization and constraints on time and cost. In this paper, a novel framework for 3D face reconstruction is presented. First, landmarks are localized on the database faces with the proposed landmark-mapping strategy, which employs a model template. Then, an autoencoder, assisted by the proposed energy function so that it simultaneously learns the facial patch subspace and the keypoint positions, is employed to predict the landmarks. Finally, a unique 3D reconstruction is obtained with the proposed predicted-landmark-based deformation. Meta-parameters are incorporated into the energy function during the training phase to enhance the performance of the autoencoder network in reconstructing the face model. The experiments are carried out on two databases, namely the USF Human ID 3-D Database and the Bosphorus 3D face database. The experimental results show that the Autoencoder based Face REconstruction with Simultaneous patch Learning and Landmark Estimation method (SL2E-AFRE) is efficient and that its performance improves significantly with each iteration.
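The abstract does not give the form of the proposed energy function. As a rough illustration only, the sketch below assumes a generic weighted sum of a patch reconstruction error and a landmark position error, with a single meta-parameter `lam` balancing the two terms. The names `joint_energy`, `squared_error`, and `lam` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def joint_energy(
    patch: Sequence[float],
    patch_recon: Sequence[float],
    landmarks: Sequence[float],
    landmarks_pred: Sequence[float],
    lam: float = 1.0,
) -> float:
    """Assumed joint objective (not the paper's exact formulation).

    Adds the patch reconstruction error to the landmark position error,
    with the landmark term scaled by the meta-parameter `lam`.
    """
    recon_term = squared_error(patch, patch_recon)
    landmark_term = squared_error(landmarks, landmarks_pred)
    return recon_term + lam * landmark_term
```

Under this assumed form, minimizing a single scalar lets one network trade off patch reconstruction against landmark accuracy. A larger `lam` favors accurate keypoint positions over faithful patch reconstruction.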
Acknowledgments
This publication is an outcome of the R&D work undertaken in the project under the Visvesvaraya PhD Scheme of the Ministry of Electronics & Information Technology, Government of India, being implemented by Digital India Corporation (formerly Media Lab Asia).
Ethics declarations
Conflict of interest
All the authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Devi, P.R.S., Baskaran, R. SL2E-AFRE : Personalized 3D face reconstruction using autoencoder with simultaneous subspace learning and landmark estimation. Appl Intell 51, 2253–2268 (2021). https://doi.org/10.1007/s10489-020-02000-y
DOI: https://doi.org/10.1007/s10489-020-02000-y