Handling missing labels and class imbalance challenges simultaneously for facial action unit recognition

Li, Yongqiang; Wu, Baoyuan; Zhao, Yongping; Yao, Hongxun; Ji, Qiang

doi:10.1007/s11042-018-6836-1

Published: 28 February 2019

Multimedia Tools and Applications, Volume 78, pages 20309–20332 (2019)

Yongqiang Li¹,
Baoyuan Wu ORCID: orcid.org/0000-0003-2183-5990²,
Yongping Zhao¹,
Hongxun Yao¹ &
…
Qiang Ji³

Abstract

Facial action unit (AU) recognition has attracted great attention because of its applications in a wide range of fields. Missing labels and class imbalance (CIB) are both challenges for AU recognition. Missing labels means that only partial label assignments are available for the training samples. CIB is observed from two perspectives: first, the number of positive AUs is much smaller than the number of negative AUs in each expression image; second, the rates of positive samples differ significantly across AUs. Both missing labels and CIB degrade AU recognition performance. In this work, we propose to handle these two challenges simultaneously. Specifically, we formulate AU recognition with missing labels as a multi-label learning with missing labels (MLML) problem, which handles the missing label challenge naturally. However, unlike most existing MLML approaches, which usually employ the same features from the whole image for all classes, we select the most relevant features for each AU. To handle the CIB challenge, we further introduce class cardinality bounds, which constrain the number of positive AUs for each data instance as well as the number of positive labels for each AU in the overall dataset. The class cardinality bounds serve as linear constraints on the objective function, which makes the optimization NP-hard. We therefore present a convex approximation based on the Lovász extension, which leads to a linear program that can be solved efficiently by the alternating direction method of multipliers (ADMM). Experimental results on both posed and spontaneous facial expression datasets demonstrate the superiority of the proposed method over state-of-the-art approaches.
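The abstract relies on the Lovász extension to relax a discrete label-selection problem into a continuous one. As a rough illustration of that general construction only (not the paper's objective, whose exact form is not given on this page), the sketch below evaluates the Lovász extension of an arbitrary set function by sorting the input vector and accumulating marginal gains. The function names, the toy graph, and the cut function used as the example set function are all assumptions made for illustration.

```python
import numpy as np


def lovasz_extension(set_fn, w):
    """Evaluate the Lovász extension of set_fn at a real vector w.

    set_fn maps a frozenset of indices to a float and must satisfy
    set_fn(frozenset()) == 0. For a submodular set_fn the extension
    is convex, which is what makes it useful as a relaxation.
    """
    w = np.asarray(w, dtype=float)
    order = np.argsort(-w, kind="stable")  # indices sorted by decreasing w
    value = 0.0
    prev = set_fn(frozenset())
    chosen = set()
    for idx in order:
        chosen.add(int(idx))
        cur = set_fn(frozenset(chosen))
        value += w[idx] * (cur - prev)  # weight times marginal gain
        prev = cur
    return value


def cut_fn(edges):
    """Return the cut function of an undirected graph (a submodular function)."""
    def f(s):
        # Count edges with exactly one endpoint inside s.
        return float(sum(1 for u, v in edges if (u in s) != (v in s)))
    return f


# Hypothetical toy graph: a path 0 - 1 - 2.
EDGES = [(0, 1), (1, 2)]
cut = cut_fn(EDGES)

# On a 0/1 indicator vector the extension agrees with the set function.
on_indicator = lovasz_extension(cut, [1.0, 0.0, 0.0])

# On a fractional vector the cut function's extension is the total
# variation |0.5 - 0.2| + |0.2 - 0.9|, which is approximately 1.0.
on_fractional = lovasz_extension(cut, [0.5, 0.2, 0.9])
```

Because the extension matches the set function on binary vectors and is convex for submodular functions, minimizing it over a convex feasible set (for example one defined by linear cardinality bounds) gives a tractable surrogate for the original discrete problem.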

Notes

The details of the ADMM-based optimization process are presented in the Supplementary Material.
The evaluation data from the two databases are available at: https://pan.baidu.com/s/1Hz72YNVBvQt-LFCAY4c43w.

Acknowledgments

Yongqiang Li is supported by the National Natural Science Foundation of China (No. 61402129) and by Postdoctoral Foundation Projects (No. LBH-Z14090, No. 2015M571417 and No. 2017T100243). Baoyuan Wu is supported by the Tencent AI Lab Foundation. Hongxun Yao is partially supported by the National Natural Science Foundation of China (No. 61472103) and its Key Program (No. 61133003).

Author information

Authors and Affiliations

Harbin Institute of Technology, Harbin, China
Yongqiang Li, Yongping Zhao & Hongxun Yao
Tencent AI Lab, Bellevue, WA, 98004, USA
Baoyuan Wu
Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA
Qiang Ji

Corresponding author

Correspondence to Baoyuan Wu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

(ZIP 237 KB)

About this article

Cite this article

Li, Y., Wu, B., Zhao, Y. et al. Handling missing labels and class imbalance challenges simultaneously for facial action unit recognition. Multimed Tools Appl 78, 20309–20332 (2019). https://doi.org/10.1007/s11042-018-6836-1

Received: 19 October 2017
Revised: 14 September 2018
Accepted: 05 November 2018
Published: 28 February 2019
Issue Date: 30 July 2019
DOI: https://doi.org/10.1007/s11042-018-6836-1
