
Facial Action Unit Detection by Exploring the Weak Relationships Between AU Labels

Conference paper in Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2022)

Abstract

In recent years, facial action unit (AU) detection has attracted increasing attention, and great progress has been made. However, few approaches address AU detection by exploiting emotion information, and the specific influence of emotion categories on AU detection has not been investigated. In this paper, we first explore the relationship between emotion categories and AU labels, and study the influence of emotion on AU detection. Using emotions as weak labels, we propose a simple yet efficient deep network that exploits limited emotion labels to constrain AU detection. The proposed network consists of two architectures: a main net and an assistant net. The main net learns semantic relations between AUs, especially the AUs related to emotions. Moreover, we design a dual pooling module embedded into the main net to further improve the results. Extensive experiments on two datasets show that AU detection benefits from the weak emotion labels. The proposed method improves significantly on the baseline and achieves state-of-the-art performance compared with other methods. Furthermore, because only the main net is used at test time, our model is very fast, running at over 278 fps.
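
The abstract gives no implementation details, but the training setup it describes (a main net supervised on AU labels, an assistant net supervised on weak emotion labels, and only the main net kept at test time) can be sketched as below. Everything in this sketch is an assumption made for illustration: the class names (MainNet, AssistantNet, DualPooling), the reading of "dual pooling" as concatenated global average and max pooling, the linear heads, and the loss weight lam are all hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DualPooling(nn.Module):
    """Hypothetical reading of the paper's dual pooling module:
    global average and global max pooling, concatenated."""
    def forward(self, x):                        # x: (B, C, H, W)
        avg = torch.mean(x, dim=(2, 3))          # global average pool -> (B, C)
        mx = torch.amax(x, dim=(2, 3))           # global max pool     -> (B, C)
        return torch.cat([avg, mx], dim=1)       # -> (B, 2C)

class MainNet(nn.Module):
    """Main branch: backbone features -> dual pooling -> multi-label AU logits.
    This is the only branch that would run at test time."""
    def __init__(self, backbone, feat_channels, num_aus):
        super().__init__()
        self.backbone = backbone                 # any CNN returning (B, C, H, W)
        self.pool = DualPooling()
        self.au_head = nn.Linear(2 * feat_channels, num_aus)

    def forward(self, x):
        pooled = self.pool(self.backbone(x))
        return self.au_head(pooled), pooled      # AU logits + features shared with the assistant

class AssistantNet(nn.Module):
    """Assistant branch: predicts an emotion category from the shared features.
    Used only during training, where emotion labels act as weak supervision."""
    def __init__(self, feat_dim, num_emotions):
        super().__init__()
        self.emo_head = nn.Linear(feat_dim, num_emotions)

    def forward(self, pooled):
        return self.emo_head(pooled)

def joint_loss(au_logits, au_targets, emo_logits, emo_targets, lam=0.1):
    """Multi-label BCE on AUs plus a weighted emotion cross-entropy term.
    au_targets: float tensor of 0/1 AU labels; emo_targets: class indices.
    The weight lam is a placeholder, not a value from the paper."""
    au_loss = nn.functional.binary_cross_entropy_with_logits(au_logits, au_targets)
    emo_loss = nn.functional.cross_entropy(emo_logits, emo_targets)
    return au_loss + lam * emo_loss
```

At inference only MainNet would run, which is consistent with the abstract's note that the assistant branch is used solely to constrain training and that the deployed model reaches over 278 fps.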



Author information

Corresponding author: Hengliang Zhu


Copyright information

© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper


Cite this paper

Tian, M. et al. (2022). Facial Action Unit Detection by Exploring the Weak Relationships Between AU Labels. In: Gao, H., Wang, X., Wei, W., Dagiuklas, T. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 461. Springer, Cham. https://doi.org/10.1007/978-3-031-24386-8_26


  • DOI: https://doi.org/10.1007/978-3-031-24386-8_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24385-1

  • Online ISBN: 978-3-031-24386-8

  • eBook Packages: Computer Science, Computer Science (R0)
