A framework for facial expression recognition using deep self-attention network

Indolia, Sakshi; Nigam, Swati; Singh, Rajiv

doi:10.1007/s12652-023-04627-4

Original Research
Published: 17 May 2023

Volume 14, pages 9543–9562, (2023)

Journal of Ambient Intelligence and Humanized Computing

Abstract

Facial expression recognition (FER) is a widely used technique for emotion recognition. In recent years, numerous deep convolutional neural network (CNN) models have been implemented for this purpose. However, CNNs are not capable of representing the most relevant parts of the input data. Moreover, existing deep models do not perform well on small datasets and cannot deal with intra-class variation and inter-class similarity. In this work, we address these issues by proposing a deep learning framework for FER that uses self-attention and data augmentation. The proposed self-attention model addresses intra-class variation and inter-class similarity, whereas the data augmentation technique improves model performance by increasing the size of smaller datasets and avoiding overfitting. The proposed model handles both posed and spontaneous expressions and has been tested on the JAFFE, CK+, RAF, FER2013, MUG, and YALE datasets. A series of experiments has been conducted with and without self-attention to validate our approach. Furthermore, we have used the sigmoid activation function in the self-attention mechanism to improve the performance of the proposed deep learning model. Experimental results show that the classification performance of a deep learning model is improved by incorporating the proposed self-attention with sigmoid activation function and the data augmentation technique. Comparative analysis using quantitative evaluation metrics (precision, recall, F1-score, and accuracy) shows that the proposed method works better than existing machine learning and deep learning methods.
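
The abstract names two ingredients, self-attention with a sigmoid activation and data augmentation, without giving their exact form. The following is only a minimal NumPy sketch of what such components could look like, not the authors' architecture. It assumes a scaled dot-product formulation in which sigmoid replaces the usual softmax, and it assumes horizontal flipping as the augmentation; the function names `sigmoid_self_attention` and `augment_with_flips`, the projection matrices, and all sizes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_self_attention(features, w_q, w_k, w_v):
    """Self-attention over n positions with sigmoid-gated weights.

    features: (n, d) array, one row per spatial position of a feature map.
    w_q, w_k, w_v: (d, d_k) projection matrices (hypothetical parameters).
    Returns the attended features (n, d_k) and the weight matrix (n, n).
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    # Scaled dot-product similarity between every pair of positions.
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    # Assumption: sigmoid in place of softmax, so each weight lies in [0, 1]
    # independently and rows are not forced to sum to one.
    weights = sigmoid(scores)
    return weights @ v, weights


def augment_with_flips(images):
    """Double a batch of (N, H, W) images by appending horizontal mirrors.

    Assumption: flipping is one plausible augmentation; the abstract does
    not state which transforms were used.
    """
    flipped = images[:, :, ::-1]
    return np.concatenate([images, flipped], axis=0)


rng = np.random.default_rng(0)
n, d, d_k = 16, 8, 4                      # illustrative sizes only
features = rng.standard_normal((n, d))
w_q = 0.1 * rng.standard_normal((d, d_k))
w_k = 0.1 * rng.standard_normal((d, d_k))
w_v = 0.1 * rng.standard_normal((d, d_k))

attended, weights = sigmoid_self_attention(features, w_q, w_k, w_v)

images = rng.random((5, 48, 48))          # five synthetic grayscale images
augmented = augment_with_flips(images)
```

In this sketch the sigmoid scores each pair of positions independently, so several regions can receive high weight at once, whereas softmax would make them compete. Whether this is the reason for the improvement reported in the abstract is not something the abstract itself states.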

Data availability

The data analyzed for this study are available upon reasonable request from the corresponding author.

Author information

Authors and Affiliations

Department of Computer Science, Banasthali Vidyapith, Banasthali, Rajasthan, 304022, India
Sakshi Indolia, Swati Nigam & Rajiv Singh
Centre for Artificial Intelligence, Banasthali Vidyapith, Banasthali, Rajasthan, 304022, India
Sakshi Indolia, Swati Nigam & Rajiv Singh

Corresponding author

Correspondence to Rajiv Singh.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest associated with the manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Indolia, S., Nigam, S. & Singh, R. A framework for facial expression recognition using deep self-attention network. J Ambient Intell Human Comput 14, 9543–9562 (2023). https://doi.org/10.1007/s12652-023-04627-4

Received: 04 April 2022
Accepted: 02 May 2023
Published: 17 May 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s12652-023-04627-4
