Multi-level spatial and semantic enhancement network for expression recognition

Ma, Yingdong; Wang, Xia; Wei, Lihua

doi:10.1007/s10489-021-02254-0

Published: 07 April 2021

Applied Intelligence, volume 51, pages 8565–8578 (2021)

Abstract

Facial expression recognition (FER) on real-world databases is an active and challenging research topic. Existing CNN-based facial expression classifiers usually perform well on common expressions, such as happy and surprise, but are less accurate on difficult expressions, such as disgust and fear. Two main factors are responsible for this problem. First, intra-class variation makes difficult expressions harder to classify than other expressions. Second, severe data imbalance for difficult expressions in most FER datasets leads to overfitting during training. In this work, a new network architecture is proposed to address the intra-class variation problem. The proposed model consists of a spatial enhancement module and a semantic aggregation module to enhance fine-level expression features and high-level semantic features. To alleviate the data imbalance problem, an iterative learning method is introduced to collect difficult expression samples. New samples with inconsistent labels are classified using a fuzzy clustering algorithm. The proposed FER framework has been evaluated on three real-world expression datasets. Experimental results demonstrate that the proposed method significantly improves the recognition accuracy of difficult expressions and achieves top performance compared with state-of-the-art methods.
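
The abstract does not name the fuzzy clustering algorithm it uses. As an illustration only, the sketch below implements standard fuzzy c-means, which is an assumption and not necessarily the authors' choice. It shows how soft memberships could be used to assign a class to samples whose labels disagree. The function name `fuzzy_c_means`, its parameters, and the toy feature vectors are all hypothetical.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length feature vectors.

    Returns (centers, memberships). memberships[i][k] is the degree to
    which point i belongs to cluster k; each row sums to 1.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in u]
            wsum = sum(weights)
            centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / wsum
                 for d in range(dim)]
            )

        # Membership update: proportional to distance ** (-2 / (m - 1)).
        new_u = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point sits exactly on one or more centres.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(hits)
                new_u.append([h / total for h in hits])
                continue
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            new_u.append([v / total for v in inv])

        # Stop once no membership changes by more than tol.
        shift = max(abs(a - b)
                    for old, new in zip(u, new_u)
                    for a, b in zip(old, new))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Toy 2-D "feature vectors" forming two well-separated groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(points, n_clusters=2)

# Hard assignment: pick the cluster with the highest membership.
labels = [row.index(max(row)) for row in memberships]
```

In a setting like the one the abstract describes, the membership rows could serve as soft votes: a sample with conflicting labels would be assigned to the cluster where its membership is highest, or set aside when no membership clearly dominates.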

Author information

Authors and Affiliations

Computer Science College, Inner Mongolia University, College Road No. 235, Hohhot, Inner Mongolia, China
Yingdong Ma, Xia Wang & Lihua Wei

Corresponding author

Correspondence to Yingdong Ma.

Additional information

This work was supported in part by the National Natural Science Foundation of China under Grant 61461039.

About this article

Cite this article

Ma, Y., Wang, X. & Wei, L. Multi-level spatial and semantic enhancement network for expression recognition. Appl Intell 51, 8565–8578 (2021). https://doi.org/10.1007/s10489-021-02254-0

Accepted: 01 February 2021
Published: 07 April 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s10489-021-02254-0
