ABSTRACT
Label distribution learning (LDL) has made great progress in facial expression recognition (FER), where generating the label distribution is a key step for LDL-based FER. However, many existing studies have shown that noisy samples are a common problem in FER, especially on in-the-wild datasets. This issue may lead to unreliable generated label distributions (which can be seen as label noise) and further degrade the FER model. To this end, we propose a plug-and-play method, self-paced label distribution learning (SPLDL), for in-the-wild FER. Specifically, a simple yet efficient label distribution generator is adopted to generate label distributions that guide label distribution learning. We then introduce the self-paced learning (SPL) paradigm and develop a novel self-paced label distribution learning strategy, which considers both classification losses and distribution losses. SPLDL first learns easy samples with reliable label distributions and gradually progresses to complex ones, effectively suppressing the negative impact of noisy samples and unreliable label distributions. Extensive experiments on in-the-wild FER datasets (i.e., RAF-DB and AffectNet) with three backbone networks demonstrate the effectiveness of the proposed method.
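The abstract does not give the exact SPLDL formulation, so the following is only a minimal sketch of the general idea, assuming the classic hard-threshold self-paced weighting (a sample is selected when its loss falls below a threshold that grows during training). The combined loss, the balancing factor `alpha`, the function names, and the toy data are all illustrative assumptions, not the authors' method.

```python
import math

def cross_entropy(probs, label):
    """Classification loss: negative log-probability of the annotated class."""
    return -math.log(max(probs[label], 1e-12))

def kl_divergence(target, probs):
    """Distribution loss: KL(target || probs) between a generated label
    distribution and the model's predicted distribution."""
    return sum(
        t * math.log(t / max(p, 1e-12))
        for t, p in zip(target, probs)
        if t > 0
    )

def self_paced_weights(losses, threshold):
    """Hard self-paced weights: 1.0 for samples whose loss is below the
    threshold (treated as easy), 0.0 otherwise."""
    return [1.0 if loss < threshold else 0.0 for loss in losses]

def spl_objective(batch, threshold, alpha=0.5):
    """Mean combined loss over the currently selected samples.

    batch: list of (predicted_probs, label, generated_distribution).
    alpha: hypothetical factor balancing the two loss terms.
    """
    per_sample = [
        alpha * cross_entropy(p, y) + (1 - alpha) * kl_divergence(d, p)
        for p, y, d in batch
    ]
    weights = self_paced_weights(per_sample, threshold)
    selected = sum(weights)
    if selected == 0:
        return 0.0, weights
    total = sum(w * l for w, l in zip(weights, per_sample))
    return total / selected, weights

# Toy batch (made-up numbers): three classes, two samples.
batch = [
    ([0.8, 0.1, 0.1], 0, [0.7, 0.2, 0.1]),  # prediction agrees with label
    ([0.2, 0.5, 0.3], 0, [0.6, 0.2, 0.2]),  # prediction disagrees with label
]

# Raising the threshold over time admits harder samples.
for threshold in (0.5, 2.0):
    loss, weights = spl_objective(batch, threshold)
    print(f"threshold={threshold}: weights={weights}, loss={loss:.4f}")
```

With the low threshold only the first sample contributes to the objective; once the threshold is raised, the second, higher-loss sample is admitted as well. This mirrors the easy-to-complex curriculum described above.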
Index Terms
- Self-Paced Label Distribution Learning for In-The-Wild Facial Expression Recognition