Facial point localization via neural networks in a cascade regression framework

Saeed, Anwar; Al-Hamadi, Ayoub; Neumann, Heiko

doi:10.1007/s11042-016-4261-x

Published: 31 January 2017

Volume 77, pages 2261–2283, (2018)

Multimedia Tools and Applications

Anwar Saeed¹,
Ayoub Al-Hamadi¹ &
Heiko Neumann²

Abstract

Facial point detection is gaining importance in computer vision because it plays a vital role in applications such as facial expression recognition and human behavior analysis. In this work, we propose an approach that locates 49 facial points via neural networks in a cascade regression fashion. The localization process starts by detecting the face, continues with a refinement of the face crop, and finally arrives at the facial point locations through five cascades of regressors. In particular, we perform a guided initialization in the first cascade using holistic features extracted from the entire face patch. The point locations are then refined in the next four cascades using local features extracted from patches enclosing the prior estimates of the points. Performing feature selection at each cascade improved the generalization capability. Evaluating our approach on samples gathered from four challenging databases, we achieved an average localization error per point ranging between 0.72 % and 1.57 % of the face width. The proposed approach was further evaluated according to the 300-W challenge, where it achieved results competitive with those of state-of-the-art approaches and commercial software packages. Moreover, our approach showed better generalization capability. Finally, we validated the proposed enhancements by studying the impact of several factors on the point localization accuracy.
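The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the raw-pixel features, and the 9x9 patch size are assumptions made here for illustration, the regressors are passed in as plain callables standing in for the trained neural networks, and the face detection, crop refinement, and feature selection steps are omitted. Only the 49 points, the five cascades, the holistic-then-local ordering, and the face-width normalization of the error come from the abstract.

```python
import numpy as np

N_POINTS = 49    # number of facial points, from the abstract
N_CASCADES = 5   # one holistic stage plus four local refinement stages


def holistic_features(face):
    """First-cascade features: the whole face patch, flattened (placeholder descriptor)."""
    return face.astype(np.float64).ravel()


def local_features(face, points, half=4):
    """Later-cascade features: pixels of a small patch around each current estimate."""
    padded = np.pad(face.astype(np.float64), half, mode="edge")
    feats = []
    for x, y in points:
        # Clamp the estimate to the image, then shift into padded coordinates.
        cx = int(np.clip(round(x), 0, face.shape[1] - 1)) + half
        cy = int(np.clip(round(y), 0, face.shape[0] - 1)) + half
        feats.append(padded[cy - half:cy + half + 1, cx - half:cx + half + 1].ravel())
    return np.concatenate(feats)


def cascade_localize(face, regressors):
    """Run the cascade: regressors[0] initializes, the others predict corrections."""
    # Guided initialization from holistic features of the entire face patch.
    points = regressors[0](holistic_features(face)).reshape(N_POINTS, 2)
    # Each later stage refines the prior estimate using local patch features.
    for regress in regressors[1:]:
        delta = regress(local_features(face, points)).reshape(N_POINTS, 2)
        points = points + delta
    return points


def mean_error_percent(pred, truth, face_width):
    """Mean Euclidean point error expressed as a percentage of the face width."""
    return 100.0 * np.linalg.norm(pred - truth, axis=1).mean() / face_width
```

Here `regressors` is a list of `N_CASCADES` callables, each mapping a feature vector to a flat array of `2 * N_POINTS` values: absolute coordinates for the first stage and coordinate corrections for the remaining four. Having the later stages predict corrections rather than absolute positions is a common choice in cascade regression and is assumed here rather than taken from the paper.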

Acknowledgments

This work is part of the project done within the Transregional Collaborative Research Centre SFB/TRR 62 Companion-Technology for Cognitive Technical Systems funded by the German Research Foundation (DFG).

Author information

Authors and Affiliations

Institute for Information Technology and Communications (IIKT), Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
Anwar Saeed & Ayoub Al-Hamadi
Institute of Neural Information Processing, University of Ulm, Ulm, Germany
Heiko Neumann

Corresponding author

Correspondence to Anwar Saeed.

About this article

Cite this article

Saeed, A., Al-Hamadi, A. & Neumann, H. Facial point localization via neural networks in a cascade regression framework. Multimed Tools Appl 77, 2261–2283 (2018). https://doi.org/10.1007/s11042-016-4261-x

Received: 13 May 2016
Revised: 13 October 2016
Accepted: 12 December 2016
Published: 31 January 2017
Issue Date: January 2018
DOI: https://doi.org/10.1007/s11042-016-4261-x
