
Facial action unit detection via hybrid relational reasoning

  • Original article
  • Published in: The Visual Computer

Abstract

Correlations among facial action units (AUs) convey significant information for AU detection, yet they have not been thoroughly exploited. Most existing methods either learn the regional correlation distribution of each AU or reason about the dependencies among AUs. However, these methods typically either predefine the correlations from prior knowledge, which often discards useful information, or learn the correlations directly under the guidance of AU detection, which often admits irrelevant information. To resolve these limitations, we propose a novel hybrid relational reasoning framework for AU detection. In particular, we adaptively reason pixel-level correlations of each AU under the constraint of regional correlations predefined by facial landmarks, as well as the supervision of AU detection. Moreover, we adaptively reason AU-level correlations using a graph convolutional network that considers both predefined AU relationships and learnable relationship weights. Our framework thus integrates the advantages of correlation predefinition and correlation learning. Extensive experiments demonstrate that our approach (i) soundly outperforms state-of-the-art AU detection methods on the challenging BP4D, DISFA, and GFT benchmarks, and (ii) can precisely reason the regional correlation distribution of each AU.
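The AU-level reasoning described above combines a predefined AU relationship graph with learnable relationship weights inside a graph convolution. The following is a minimal sketch of that idea, not the authors' implementation: the function names (`normalize_adj`, `gcn_layer`), the blending parameter `alpha`, and the toy adjacency matrices are all illustrative assumptions.

```python
import numpy as np

def normalize_adj(a):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as commonly
    # used in graph convolutional networks (self-loops added via I).
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return d_inv_sqrt @ a_hat @ d_inv_sqrt

def gcn_layer(x, adj_prior, adj_learned, w, alpha=0.5):
    # Hybrid adjacency: blend the predefined AU relationship graph
    # with learnable (here, randomly initialized) relationship weights.
    adj = alpha * adj_prior + (1.0 - alpha) * adj_learned
    # One graph convolution followed by ReLU.
    return np.maximum(normalize_adj(adj) @ x @ w, 0.0)

# Toy example: 4 AUs, each represented by an 8-dim feature vector.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
# Hypothetical predefined AU co-occurrence graph (symmetric, 0/1).
adj_prior = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
# Learnable weights, symmetrized; in training these would be parameters.
adj_learned = rng.random((4, 4))
adj_learned = (adj_learned + adj_learned.T) / 2.0
w = rng.standard_normal((8, 8))

h = gcn_layer(x, adj_prior, adj_learned, w)
```

In a trained model, `adj_learned` and `w` would be optimized jointly with the AU detection loss, so the learned weights can refine the predefined graph rather than replace it.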



Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 62106268), and the High-Level Talent Program for Innovation and Entrepreneurship (ShuangChuang Doctor) of Jiangsu Province (No. JSSCBS20211220). It was also partially supported by the National Natural Science Foundation of China (No. 62101555 and No. 62002360), the Natural Science Foundation of Jiangsu Province (No. BK20201346 and No. BK20210488), and the Fundamental Research Funds for the Central Universities (No. 2021QN+1072).

Author information


Corresponding author

Correspondence to Yong Zhou.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Shao, Z., Zhou, Y., Liu, B. et al. Facial action unit detection via hybrid relational reasoning. Vis Comput 38, 3045–3057 (2022). https://doi.org/10.1007/s00371-022-02527-w

