Abstract
Face analysis has been an active research field in computer vision for decades, and datasets are vital to modern machine learning methods. This paper proposes a flexible CG (computer graphics) rendering pipeline for creating facial image datasets with automatic ground truth labelling. Compared with traditional dataset creation methods, which need expensive hardware and lengthy manual ground truth labelling, the proposed pipeline can produce a large amount of labelled data quickly and at low cost. The paper also proposes a data capture setup in the CG environment for creating a dataset for facial landmark localization. The effectiveness of the proposed method is verified by cross validation with the Multi-PIE dataset. Creating a high-quality training dataset requires attention to the factors that vary across the dataset. The paper analyzes several such factors for accurate eye landmark localization, including eye closure level, eye and eyebrow shape, and the wearing of glasses. Building on the proposed CG rendering pipeline, the paper implements a facial landmark localization system that works across large face rotations by integrating off-the-shelf algorithms. Experiments on Multi-PIE and on real persons show that the implemented system can localize facial landmarks accurately across [−90°, +90°] of yaw rotation in real time.
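As a rough illustration of why a CG pipeline can label data automatically: once the 3D positions of the landmark points on a head model are known, their 2D image positions follow directly from the camera projection, so no manual annotation is needed. The sketch below assumes a simple pinhole camera and a head rotating in yaw about the vertical axis. The function names, camera distance, focal length, image size, and landmark coordinates are all hypothetical and are not taken from the paper.

```python
import math

def rotate_yaw(point, yaw_deg):
    """Rotate a 3D point (x, y, z) about the vertical (y) axis by yaw_deg degrees."""
    x, y, z = point
    t = math.radians(yaw_deg)
    return (x * math.cos(t) + z * math.sin(t),
            y,
            -x * math.sin(t) + z * math.cos(t))

def project_landmark(point, yaw_deg, cam_dist=0.6, focal_px=800.0,
                     width=640, height=480):
    """Project a head-space landmark to pixel coordinates (u, v).

    Assumed setup (hypothetical): the head is centered at the origin and the
    camera sits at (0, 0, cam_dist), looking toward the origin along -z.
    """
    x, y, z = rotate_yaw(point, yaw_deg)
    depth = cam_dist - z  # distance from the camera along its optical axis
    if depth <= 0:
        raise ValueError("landmark is behind the camera")
    u = width / 2 + focal_px * x / depth
    v = height / 2 - focal_px * y / depth  # image v grows downward
    return (u, v)

def label_views(landmarks, yaws_deg):
    """Return {yaw: {name: (u, v)}} for every requested yaw angle.

    Self-occlusion is not handled here; a real pipeline would also need
    a visibility test for landmarks hidden by the head at large yaw.
    """
    return {
        yaw: {name: project_landmark(p, yaw) for name, p in landmarks.items()}
        for yaw in yaws_deg
    }

# Hypothetical landmark positions in metres, head facing +z.
LANDMARKS = {
    "nose_tip": (0.0, 0.0, 0.10),
    "left_eye_outer": (-0.045, 0.035, 0.07),
    "right_eye_outer": (0.045, 0.035, 0.07),
}

labels = label_views(LANDMARKS, range(-90, 91, 30))
```

Each rendered view would then be stored alongside its entry in `labels`, giving image and ground truth pairs over the whole [−90°, +90°] yaw range without any manual annotation step.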
Acknowledgements
The work was partially supported by the National Natural Science Foundation of China under Grant No. 61873189, the Natural Science Foundation of Shanghai under Grant No. 18ZR1442500 and the Fundamental Research Funds for the Central Universities.
About this article
Cite this article
Dong, Y., Lin, M., Yue, J. et al. A low-cost photorealistic CG dataset rendering pipeline for facial landmark localization. Multimed Tools Appl 78, 22397–22420 (2019). https://doi.org/10.1007/s11042-019-7516-5
DOI: https://doi.org/10.1007/s11042-019-7516-5