research-article

Self-Paced Label Distribution Learning for In-The-Wild Facial Expression Recognition

Authors:
Jianjian Shao, University of Electronic Science and Technology of China, Chengdu, China
Zhenqian Wu, University of Electronic Science and Technology of China, Chengdu, China
Yuanyan Luo, University of Electronic Science and Technology of China, Chengdu, China
Shudong Huang, Sichuan University, Chengdu, China
Xiaorong Pu, University of Electronic Science and Technology of China, Chengdu, China
Yazhou Ren, University of Electronic Science and Technology of China, Chengdu, China

MM '22: Proceedings of the 30th ACM International Conference on Multimedia, October 2022, Pages 161–169, https://doi.org/10.1145/3503161.3547960

Published: 10 October 2022

ABSTRACT

Label distribution learning (LDL) has achieved great progress in facial expression recognition (FER), where generating the label distribution is a key step for LDL-based FER. However, many existing studies have reported that noisy samples are a common problem in FER, especially on in-the-wild datasets. This issue can lead to unreliable generated label distributions (which can be seen as label noise) and, in turn, degrade the FER model. To this end, we propose a plug-and-play method, self-paced label distribution learning (SPLDL), for in-the-wild FER. Specifically, a simple yet efficient label distribution generator is adopted to generate label distributions that guide label distribution learning. We then introduce the self-paced learning (SPL) paradigm and develop a novel self-paced label distribution learning strategy that considers both classification losses and distribution losses. SPLDL first learns easy samples with reliable label distributions and gradually moves on to complex ones, effectively suppressing the negative impact of noisy samples and unreliable label distributions. Extensive experiments on in-the-wild FER datasets (i.e., RAF-DB and AffectNet) with three backbone networks demonstrate the effectiveness of the proposed method.
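
The abstract does not give the SPLDL formulation, so the following is only a rough sketch of the general idea it describes: each sample gets a combined classification and distribution loss, and a self-paced rule admits easy (low-loss) samples first. It assumes the classic hard-threshold self-paced weighting, cross-entropy as the classification loss, and KL divergence as the distribution loss. The names `lam` (age parameter), `alpha` (loss balance), and all function names are hypothetical and are not taken from the paper.

```python
import math

EPS = 1e-12  # guards against log(0)


def cross_entropy(probs, label):
    """Classification loss: negative log-probability of the annotated class."""
    return -math.log(max(probs[label], EPS))


def kl_divergence(target, probs):
    """Distribution loss: KL(target || probs) for a generated label distribution."""
    return sum(
        t * math.log(t / max(p, EPS)) for t, p in zip(target, probs) if t > 0
    )


def self_paced_weights(losses, lam):
    """Hard self-paced rule (assumed here): keep samples whose loss is below lam."""
    return [1.0 if loss < lam else 0.0 for loss in losses]


def spl_objective(preds, labels, dists, lam, alpha=1.0):
    """Mean combined loss over the samples selected at the current pace lam."""
    per_sample = [
        cross_entropy(p, y) + alpha * kl_divergence(d, p)
        for p, y, d in zip(preds, labels, dists)
    ]
    weights = self_paced_weights(per_sample, lam)
    selected = sum(weights)
    if selected == 0:
        return 0.0, weights, per_sample
    mean_loss = sum(w * l for w, l in zip(weights, per_sample)) / selected
    return mean_loss, weights, per_sample


# Toy batch: three samples annotated as class 0, with increasingly poor predictions.
preds = [[0.8, 0.1, 0.1], [0.4, 0.5, 0.1], [0.05, 0.05, 0.9]]
labels = [0, 0, 0]
dists = [[0.9, 0.05, 0.05], [0.7, 0.2, 0.1], [0.9, 0.05, 0.05]]

# Raising lam over training admits harder samples step by step.
for lam in (0.5, 1.5, 10.0):
    loss, weights, _ = spl_objective(preds, labels, dists, lam)
    print(f"lam={lam}: selected={weights}, mean loss={loss:.4f}")
```

In this toy batch, the smallest `lam` selects only the first sample, and increasing `lam` admits the second and then the third. A likely noisy sample (the third one, whose prediction strongly disagrees with its label and distribution) therefore influences training only late, if at all.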

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
  2. Machine learning
    1. Machine learning approaches

Published in
MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
General Chairs: João Magalhães (NOVA University of Lisbon, Portugal), Alberto del Bimbo (University of Florence, Italy), Shin'ichi Satoh (National Institute of Informatics, Japan), Nicu Sebe (University of Trento, Italy)
Program Chairs: Xavier Alameda-Pineda (Inria, Grenoble, France), Qin Jin (Renmin University of China, China), Vincent Oria (New Jersey Institute of Technology, USA), Laura Toni (University College London, UK)
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
facial expression recognition
label distribution learning
self-paced learning
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%