Abstract
Facial point detection is gaining importance in computer vision because it plays a vital role in applications such as facial expression recognition and human behavior analysis. In this work, we propose an approach that locates 49 facial points using neural networks in a cascade regression fashion. The localization process starts by detecting the face, then refines the face crop, and finally arrives at the facial point locations through five cascades of regressors. In particular, we perform a guided initialization using holistic features extracted from the entire face patch. The point locations are then refined in the next four cascades using local features extracted from patches enclosing the prior estimates of the points. Generalization capability is improved by performing feature selection at each cascade. Evaluating our approach on samples gathered from four challenging databases, we achieved an average localization error per point ranging between 0.72% and 1.57% of the face width. The proposed approach was further evaluated according to the 300-W challenge, where we achieved results competitive with those obtained by state-of-the-art approaches and commercial software packages. Moreover, our approach showed better generalization capability. Finally, we validated the proposed enhancements by studying the impact of several factors on the point localization accuracy.
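As a rough illustration of the control flow described in the abstract, the following minimal Python sketch shows one holistic initialization stage followed by four local refinement stages, together with a per-point error expressed as a percentage of the face width. It is not the authors' implementation: the names `run_cascade`, `per_point_error_percent`, `toy_initial`, and `toy_refiner` are hypothetical, the toy regressors stand in for the trained neural networks, and the error is assumed to be a Euclidean point-to-point distance, which may differ from the exact definition used in the paper.

```python
from math import hypot
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Shape = List[Point]


def run_cascade(
    image: object,
    initial_regressor: Callable[[object], Shape],
    refiners: Sequence[Callable[[object, Shape], Shape]],
) -> Shape:
    """Apply one initialization stage, then each refinement stage in order."""
    # Stage 1: guided initialization from the whole face patch (holistic features).
    shape = initial_regressor(image)
    # Later stages: each refiner sees the current estimate and returns per-point
    # offsets, which are added to the estimate (local features in the paper).
    for refine in refiners:
        deltas = refine(image, shape)
        shape = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(shape, deltas)]
    return shape


def per_point_error_percent(
    predicted: Shape, truth: Shape, face_width: float
) -> List[float]:
    """Euclidean error of each point as a percentage of the face width (assumed metric)."""
    return [
        100.0 * hypot(px - tx, py - ty) / face_width
        for (px, py), (tx, ty) in zip(predicted, truth)
    ]


# Toy stand-ins (hypothetical): two ground-truth points instead of 49.
TRUTH: Shape = [(10.0, 20.0), (30.0, 40.0)]


def toy_initial(image: object) -> Shape:
    """Pretend holistic regressor: every point starts 8 pixels off in x."""
    return [(x + 8.0, y) for x, y in TRUTH]


def toy_refiner(image: object, shape: Shape) -> Shape:
    """Pretend local regressor: moves each point halfway toward the truth."""
    return [(0.5 * (tx - x), 0.5 * (ty - y)) for (x, y), (tx, ty) in zip(shape, TRUTH)]


# Five stages in total: one initialization plus four refinements.
final_shape = run_cascade(None, toy_initial, [toy_refiner] * 4)
print(per_point_error_percent(final_shape, TRUTH, face_width=100.0))
```

In this toy setup each refinement halves the remaining error, so the sketch only demonstrates the coarse-to-fine structure; the feature extraction, feature selection, and network training described in the abstract are deliberately left out.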
Acknowledgments
This work is part of the project done within the Transregional Collaborative Research Centre SFB/TRR 62 Companion-Technology for Cognitive Technical Systems funded by the German Research Foundation (DFG).
Cite this article
Saeed, A., Al-Hamadi, A. & Neumann, H. Facial point localization via neural networks in a cascade regression framework. Multimed Tools Appl 77, 2261–2283 (2018). https://doi.org/10.1007/s11042-016-4261-x
DOI: https://doi.org/10.1007/s11042-016-4261-x