Variational Gaussian Process Auto-Encoder for Ordinal Prediction of Facial Action Units

Eleftheriadis, Stefanos; Rudovic, Ognjen; Deisenroth, Marc Peter; Pantic, Maja

doi:10.1007/978-3-319-54184-6_10

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10112)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We address the task of simultaneous feature fusion and modeling of discrete ordinal outputs. We propose a novel Gaussian process (GP) auto-encoder modeling approach. In particular, we introduce GP encoders to project multiple observed features onto a latent space, while GP decoders are responsible for reconstructing the original features. Inference is performed in a novel variational framework, where the recovered latent representations are further constrained by the ordinal output labels. In this way, we seamlessly integrate the ordinal structure in the learned manifold, while attaining robust fusion of the input features. We demonstrate the representation abilities of our model on benchmark datasets from machine learning and affect analysis. We further evaluate the model on the tasks of feature fusion and joint ordinal prediction of facial action units. Our experiments demonstrate the benefits of the proposed approach compared to the state of the art.

Notes

1. The subscript r indicates that the process facilitates the recognition model.
2. For simplicity we assume an isotropic (diagonal) covariance across the dimensions.
3. Note that we adopt here a linear model for $g_c(\cdot)$ as it operates on a low-dimensional non-linear manifold $\boldsymbol{X}$, already obtained by the GP auto-encoder.
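
To make Note 3 concrete, the following is a minimal sketch of how a linear function on a latent point could feed an ordinal likelihood. It assumes a cumulative probit (ordered threshold) model, which is a common choice for ordinal outputs; the chapter's exact likelihood is not shown in this preview, so this is an illustration under stated assumptions rather than the authors' formulation. The names `ordinal_probs`, `w`, `thresholds`, and `sigma` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordinal_probs(x, w, thresholds, sigma=1.0):
    """Return the probability of each ordinal level for one latent point x.

    Assumptions (not taken from the chapter):
      - g = w . x is a linear projection of the latent point,
      - thresholds are increasing interior cut points b_1 < ... < b_{S-1},
      - p(y = s) = Phi((b_s - g) / sigma) - Phi((b_{s-1} - g) / sigma),
        with b_0 = -inf and b_S = +inf.
    """
    g = sum(wi * xi for wi, xi in zip(w, x))
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [normal_cdf((b - g) / sigma) for b in cuts]
    return [cdf[k + 1] - cdf[k] for k in range(len(cuts) - 1)]


# Hypothetical 2-D latent point, weights, and three cut points (four levels).
probs = ordinal_probs(x=[0.5, -1.0], w=[1.0, 0.5], thresholds=[-1.0, 0.0, 1.0])
print(probs)
```

Because the cut points are ordered, the level probabilities are non-negative and sum to one, and moving the latent projection `g` upward shifts probability mass toward higher levels.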


Acknowledgement

This work has been funded by the European Community Horizon 2020 under grant agreements No. 645094 (SEWA) and No. 688835 (DE-ENIGMA). MPD has been supported by a Google Faculty Research Award.

Author information

Authors and Affiliations

Department of Computing, Imperial College London, London, UK
Stefanos Eleftheriadis, Ognjen Rudovic, Marc Peter Deisenroth & Maja Pantic
EEMCS, University of Twente, Enschede, The Netherlands
Maja Pantic

Corresponding author

Correspondence to Stefanos Eleftheriadis.

Editor information

Editors and Affiliations

National Tsing Hua University, Hsinchu, Taiwan
Shang-Hong Lai
Graz University of Technology, Graz, Austria
Vincent Lepetit
Drexel University, Philadelphia, Pennsylvania, USA
Ko Nishino
The University of Tokyo, Tokyo, Japan
Yoichi Sato

Cite this paper

Eleftheriadis, S., Rudovic, O., Deisenroth, M.P., Pantic, M. (2017). Variational Gaussian Process Auto-Encoder for Ordinal Prediction of Facial Action Units. In: Lai, SH., Lepetit, V., Nishino, K., Sato, Y. (eds) Computer Vision – ACCV 2016. ACCV 2016. Lecture Notes in Computer Science, vol 10112. Springer, Cham. https://doi.org/10.1007/978-3-319-54184-6_10

DOI: https://doi.org/10.1007/978-3-319-54184-6_10
Published: 10 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-54183-9
Online ISBN: 978-3-319-54184-6
eBook Packages: Computer Science; Computer Science (R0)
