Abstract
In recent years, facial action unit (AU) detection has attracted increasing attention and great progress has been made. However, few approaches exploit emotion information to solve the AU detection problem, and the specific influence of emotion categories on AU detection has not been investigated. In this paper, we first explore the relationship between emotion categories and AU labels, and study the influence of emotion on AU detection. Using weak emotion labels, we propose a simple yet efficient deep network that uses limited emotion labels to constrain AU detection. The proposed network contains two architectures: a main net and an assistant net. The main net can learn semantic relations between AUs, especially the AUs related to emotions. Moreover, we design a dual pooling module embedded into the main net to further improve the results. Extensive experiments on two datasets show that AU detection benefits from the weak emotion labels. The proposed method significantly improves on the baseline and achieves state-of-the-art performance compared with other methods. Furthermore, because only the main net is used at test time, our model is very fast and achieves over 278 fps.
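The abstract does not specify the dual pooling module or the joint training objective. As an illustration only, a minimal sketch, assuming dual pooling concatenates global average- and max-pooled features and the assistant net contributes a weighted emotion cross-entropy term (the weight `lam` and all function names are hypothetical), might look like:

```python
import numpy as np

def dual_pool(feat):
    """Dual pooling (assumed form): concatenate global average- and
    max-pooled descriptors of a (C, H, W) feature map."""
    avg = feat.mean(axis=(1, 2))   # global average pooling per channel
    mx = feat.max(axis=(1, 2))     # global max pooling per channel
    return np.concatenate([avg, mx])

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def joint_loss(au_logits, au_labels, emo_logits, emo_label, lam=0.1):
    """Multi-label BCE over AU logits (main net) plus a weighted
    softmax cross-entropy on the weak emotion label (assistant net)."""
    p = sigmoid(au_logits)
    bce = -np.mean(au_labels * np.log(p + 1e-8)
                   + (1 - au_labels) * np.log(1 - p + 1e-8))
    e = np.exp(emo_logits - emo_logits.max())  # stable softmax
    ce = -np.log(e[emo_label] / e.sum() + 1e-8)
    return bce + lam * ce
```

At test time only the main-net branch (AU logits) would be evaluated, which is consistent with the reported speed since the assistant net adds no inference cost.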
Copyright information
© 2022 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Tian, M. et al. (2022). Facial Action Unit Detection by Exploring the Weak Relationships Between AU Labels. In: Gao, H., Wang, X., Wei, W., Dagiuklas, T. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 461. Springer, Cham. https://doi.org/10.1007/978-3-031-24386-8_26
DOI: https://doi.org/10.1007/978-3-031-24386-8_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24385-1
Online ISBN: 978-3-031-24386-8
eBook Packages: Computer Science, Computer Science (R0)