Abstract
Facial expression recognition (FER) in the wild is very challenging because of occlusion, pose, illumination, and other uncontrolled factors. Learning discriminative features for FER with convolutional neural networks is difficult owing to significant class imbalance, mislabeled samples, inter-class similarity, and intra-class variation. Traditional methods optimize the convolutional network with the cross-entropy loss to obtain discriminative features for classification. However, this loss cannot effectively address the above problems in practice and does not yield highly discriminative facial features for further analysis. Center loss improves learning efficiency by reducing the intra-class distance between similar expressions, but its handling of inter-class similarity, class imbalance, and generalization remains insufficient. In this paper, we propose a lightweight Effective Attention Feature Reconstruction loss (EAFR loss), which further optimizes the feature space and enhances the discriminability of expressions. The loss combines a Focal Smoothing loss (FS loss) and an Aggregation-Separation loss (AS loss). First, the FS loss mitigates the poor recognition performance caused by imbalanced classes and prevents biased learning toward dominant classes. Meanwhile, the AS loss more precisely condenses intra-class expression features and expands inter-class distances, using a progressive-stage max-pooling channel and position attention mechanism together with a lightweight asymmetric autoencoder for feature reconstruction. Finally, the EAFR loss joins these two loss functions to address the typical problems of FER in the wild more comprehensively.
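The FS loss described above pairs a focal term (to down-weight easy samples under class imbalance) with label smoothing (to soften over-confident targets). As a minimal sketch of that combination, assuming a standard focal modulation of a label-smoothed cross entropy (the abstract does not give the paper's exact formula, so the form and the hyperparameters `gamma` and `eps` here are assumptions):

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def focal_smoothing_loss(logits, labels, gamma=2.0, eps=0.1):
    """Hypothetical FS-loss sketch: label-smoothed cross entropy
    modulated by a focal factor (1 - p)^gamma that down-weights
    easy, well-classified samples."""
    n, k = logits.shape
    p = softmax(logits)
    # label smoothing: spread eps mass uniformly over all classes
    q = np.full((n, k), eps / k)
    q[np.arange(n), labels] += 1.0 - eps
    # focal modulation of the per-class log-likelihood terms
    per_sample = np.sum(q * (1.0 - p) ** gamma * -np.log(p + 1e-12), axis=-1)
    return float(per_sample.mean())
```

A confidently correct prediction should incur a much smaller loss than a confidently wrong one, which is what makes the focal factor suppress the gradient contribution of already-easy majority-class samples.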
We validate the proposed loss function on three of the most commonly used large-scale in-the-wild expression datasets (RAF-DB, FERPlus, and AffectNet), and the results show that our model outperforms several state-of-the-art methods.
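The AS loss's two goals, condensing intra-class features and expanding inter-class distances, can be sketched with a center-loss-style objective: pull each feature toward its class center (aggregation) and push class centers apart up to a margin (separation). This is a simplified illustration only; the paper's actual AS loss additionally uses the attention mechanism and asymmetric autoencoder reconstruction described above, which this sketch omits, and the `margin` and `lam` parameters are assumptions:

```python
import numpy as np

def aggregation_separation_loss(features, labels, centers, margin=10.0, lam=1.0):
    """Simplified aggregation-separation sketch (not the paper's exact AS loss).

    aggregation: mean squared distance of each feature to its own class
                 center, as in center loss.
    separation:  hinge penalty whenever two class centers lie closer
                 than `margin`, pushing classes apart.
    """
    agg = np.mean(np.sum((features - centers[labels]) ** 2, axis=1))
    k = centers.shape[0]
    sep, pairs = 0.0, 0
    for i in range(k):
        for j in range(i + 1, k):
            d = np.linalg.norm(centers[i] - centers[j])
            sep += max(0.0, margin - d) ** 2
            pairs += 1
    return agg + lam * sep / max(pairs, 1)
```

When every feature sits exactly on its class center and all centers are at least `margin` apart, both terms vanish, which is the configuration the loss drives the feature space toward.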
Acknowledgments
This research is supported by the National Science Foundation of China under Grants 61966035 and U1803261, by the Autonomous Region Science and Technology Department International Cooperation Project under Grant 2020E01023, by the Tianshan Innovation Team Plan Project of Xinjiang Uygur Autonomous Region under Grant 202101642, and by the Funds for Creative Research Groups of Higher Education of Xinjiang Uygur Autonomous Region under Grant XJEDU 2017T002.
Ethics declarations
Conflict of interest
The authors declare that this work is original and that there is no conflict of interest.
Cite this article
Gong, W., Fan, Y. & Qian, Y. Effective attention feature reconstruction loss for facial expression recognition in the wild. Neural Comput & Applic 34, 10175–10187 (2022). https://doi.org/10.1007/s00521-022-07016-8