DOI: 10.1145/3647649.3647650

research-article

Facial Action Unit Detection based on Multi-task Learning

Published: 03 May 2024

ABSTRACT

Facial Action Unit (AU) detection uses computer vision and machine learning techniques to identify the activation of specific facial muscle movements. Combinations of AUs describe and quantify changes in facial expression, making AU detection an important task in facial attribute analysis. In recent years, deep learning has driven significant progress in facial AU detection. However, most work has treated it as a single task, training dedicated models for AU detection alone. This overlooks the relationships between AUs and other facial attributes, limiting robustness to noise and adaptability. This paper proposes a multi-task learning method for facial AU detection. Specifically, the model simultaneously learns facial AU detection, facial landmark detection, and facial emotion recognition. By sharing the underlying network across tasks, the model learns more general feature representations and thereby generalizes better. In addition, the landmark coordinates produced by the landmark-detection branch supply attention maps for the AUs, which suppress interference from irrelevant facial regions and improve detection performance. The proposed method achieves competitive results on the widely used BP4D dataset.
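To make the described setup concrete, the following is a minimal PyTorch sketch of a shared backbone feeding three task heads, where the predicted landmarks are turned into a soft attention map that gates the features seen by the AU head. The layer sizes, head designs, the Gaussian form of the attention (and its width sigma), and counts such as num_aus=12, num_landmarks=49, and num_emotions=8 are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class MultiTaskAUNet(nn.Module):
    """Hypothetical sketch: a shared backbone feeding AU, landmark,
    and emotion heads, with landmark-driven attention for the AU head."""

    def __init__(self, num_aus=12, num_landmarks=49, num_emotions=8):
        super().__init__()
        # Shared backbone: learns general facial features used by all tasks.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
        )

        def head(out_dim):
            return nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, out_dim)
            )

        self.au_head = head(num_aus)            # per-AU logits
        self.emotion_head = head(num_emotions)  # emotion-class logits
        # Landmark head predicts normalized (x, y) coordinates in [0, 1].
        self.landmark_head = nn.Sequential(head(num_landmarks * 2), nn.Sigmoid())

    @staticmethod
    def landmark_attention(landmarks, size, sigma=0.05):
        """One plausible realization of 'landmarks provide attention maps':
        place a Gaussian bump at each predicted landmark and take the
        pointwise maximum over landmarks."""
        b = landmarks.shape[0]
        coords = torch.linspace(0, 1, size, device=landmarks.device)
        gy, gx = torch.meshgrid(coords, coords, indexing="ij")  # (size, size)
        lx = landmarks[..., 0].view(b, -1, 1, 1)                # (b, L, 1, 1)
        ly = landmarks[..., 1].view(b, -1, 1, 1)
        d2 = (gx - lx) ** 2 + (gy - ly) ** 2                    # (b, L, size, size)
        att = torch.exp(-d2 / (2 * sigma ** 2)).amax(dim=1)     # (b, size, size)
        return att.unsqueeze(1)                                 # (b, 1, size, size)

    def forward(self, x):
        feat = self.backbone(x)                                 # (b, 256, h, w)
        landmarks = self.landmark_head(feat).view(x.shape[0], -1, 2)
        att = self.landmark_attention(landmarks, feat.shape[-1])
        au_logits = self.au_head(feat * att)    # AU head attends to landmark regions
        emotion_logits = self.emotion_head(feat)
        return au_logits, landmarks, emotion_logits


model = MultiTaskAUNet()
au_logits, landmarks, emotion_logits = model(torch.randn(2, 3, 224, 224))
print(au_logits.shape, landmarks.shape, emotion_logits.shape)
# torch.Size([2, 12]) torch.Size([2, 49, 2]) torch.Size([2, 8])
```

In training, the three tasks would typically be combined into a single objective, e.g. a weighted sum of a binary cross-entropy loss on au_logits, an L2 loss on landmarks, and a cross-entropy loss on emotion_logits; the task weights are hyperparameters the abstract does not specify.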


Published in

ICIGP '24: Proceedings of the 2024 7th International Conference on Image and Graphics Processing
January 2024, 480 pages
ISBN: 9798400716720
DOI: 10.1145/3647649
Copyright © 2024 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

