Discriminative Deep Face Shape Model for Facial Point Detection

Wu, Yue; Ji, Qiang

doi:10.1007/s11263-014-0775-8

Published: 04 November 2014

Volume 113, pages 37–53, (2015)

International Journal of Computer Vision

Abstract

Facial point detection is an active area in computer vision due to its relevance to many applications. It is a nontrivial task, since facial shapes vary significantly with facial expression, pose, and occlusion. In this paper, we address this problem by proposing a discriminative deep face shape model built on an augmented factorized three-way Restricted Boltzmann Machine (RBM). Specifically, the discriminative deep model combines top-down information from the embedded face shape patterns with bottom-up measurements from local point detectors in a unified framework. Along with the model, we propose effective algorithms for model learning and for inferring the true facial point locations from their measurements. Using the discriminative deep face shape model, we detect 68 facial points on facial images in both controlled and “in-the-wild” conditions. Experiments on benchmark data sets show the effectiveness of the proposed facial point detection algorithm against state-of-the-art methods.
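
The abstract does not give the model's equations. As a rough illustration of the kind of building block it names, the sketch below evaluates the energy of a generic factored three-way Restricted Boltzmann Machine that couples measured point locations, a candidate true shape, and binary hidden units, and computes the hidden-unit posteriors that follow from that energy. It is not the authors' augmented model: the bias-free energy form, the function names, the variable names (`m`, `y`, `h`, `Wm`, `Wy`, `Wh`), and the toy dimensions are all assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def factored_energy(m, y, h, Wm, Wy, Wh):
    """Energy of a generic factored three-way RBM (bias terms omitted).

    m: measured point coordinates (assumed bottom-up input)
    y: candidate true shape coordinates (assumed output)
    h: binary hidden units
    Wm, Wy, Wh: one weight vector per factor for m, y, and h.
    E(m, y, h) = -sum_f (Wm[f].m) * (Wy[f].y) * (Wh[f].h)
    """
    return -sum(
        dot(wm, m) * dot(wy, y) * dot(wh, h)
        for wm, wy, wh in zip(Wm, Wy, Wh)
    )


def hidden_probs(m, y, Wm, Wy, Wh):
    """P(h_k = 1 | m, y) = sigmoid(sum_f Wh[f][k] * (Wm[f].m) * (Wy[f].y))."""
    n_hidden = len(Wh[0])
    acts = [0.0] * n_hidden
    for wm, wy, wh in zip(Wm, Wy, Wh):
        gate = dot(wm, m) * dot(wy, y)  # per-factor product of the two projections
        for k in range(n_hidden):
            acts[k] += wh[k] * gate
    return [1.0 / (1.0 + math.exp(-a)) for a in acts]


def random_matrix(rng, rows, cols):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy setup: 4 points (8 coordinates), 3 factors, 2 hidden units (all hypothetical).
rng = random.Random(0)
n_coords, n_factors, n_hidden = 8, 3, 2
Wm = random_matrix(rng, n_factors, n_coords)
Wy = random_matrix(rng, n_factors, n_coords)
Wh = random_matrix(rng, n_factors, n_hidden)
y = [rng.uniform(-1, 1) for _ in range(n_coords)]  # stand-in "true" shape
m = [v + rng.gauss(0, 0.1) for v in y]             # noisy "measurements" of it
probs = hidden_probs(m, y, Wm, Wy, Wh)
print(probs)
```

In this generic form the hidden units are conditionally independent given `m` and `y`, so each posterior is a sigmoid of the energy difference between the unit being on and off. How the paper's model augments this structure, and how it is trained, is not stated in the abstract.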

Acknowledgments

This work is supported in part by a grant from the US Army Research Office (W911NF-12-C-0017).

Author information

Authors and Affiliations

Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY, 12180-3590, USA
Yue Wu & Qiang Ji

Corresponding author

Correspondence to Qiang Ji.

Additional information

Communicated by Marc’Aurelio Ranzato, Geoffrey E. Hinton, and Yann LeCun.

About this article

Cite this article

Wu, Y., Ji, Q. Discriminative Deep Face Shape Model for Facial Point Detection. Int J Comput Vis 113, 37–53 (2015). https://doi.org/10.1007/s11263-014-0775-8

Received: 09 February 2014
Accepted: 13 October 2014
Published: 04 November 2014
Issue Date: May 2015
DOI: https://doi.org/10.1007/s11263-014-0775-8
