Visual attention based composite dense neural network for facial expression recognition

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Facial expression recognition (FER) models have received considerable attention in computer vision and underpin many real-time applications. This article proposes a unique deep learning model, the Visual Attention based Composite Dense Neural Network (VA-CDNN), for recognising expressions from facial images. We extract the eye-pair, mouth, and normalized face regions from facial images using localized facial landmark points. The eye-pair and mouth regions provide local information, while the normalized face provides comprehensive, holistic information about the facial expression. Each cropped region is passed independently through the pre-trained Xception deep ConvNet to obtain the most discriminative spatial representations of that region. These representations serve as input to the proposed Visual Attention block. Rather than giving equal importance to every feature in the spatial representation, the block computes an attention weight for each feature map to indicate how much attention it should receive. The attention-based features obtained from all three regions are then fused into a compact, discriminative representation that leads to better identification of facial expressions. A regularized dense neural network is trained on these visual attention based features to identify the type of facial expression. The efficacy and robustness of the attention-based approach are demonstrated through experimental studies on the benchmark JAFFE and CK+ datasets. The proposed VA-CDNN achieved highest test accuracies of 97.67% on CK+ and 97.46% on JAFFE. The experimental results show that the proposed method using attention-based features is comparable to the best recent models, and that its performance improves consistently regardless of the number of expressions considered for recognition.
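
The attention and fusion steps described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact attention formula or fusion operator, so everything concrete here is an assumption made for illustration: the softmax scoring over globally pooled feature maps, the concatenation-based fusion, the tiny tensor shapes, the random stand-ins for Xception outputs, and the names `channel_attention`, `fuse_regions`, `w`, and `b`. It is not the authors' implementation.

```python
import numpy as np

REGIONS = ("eyes", "mouth", "face")  # the three cropped regions named in the abstract


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def channel_attention(feature_maps, w, b):
    """Weight each feature map (channel) by a scalar attention score.

    feature_maps: (H, W, C) array standing in for a ConvNet's spatial output.
    w, b: hypothetical learned parameters with shapes (C, C) and (C,).
    Returns the attended (C,) feature vector and the (C,) attention weights.
    """
    descriptor = feature_maps.mean(axis=(0, 1))   # (C,) one summary value per map
    weights = softmax(descriptor @ w + b)         # (C,) positive, sums to 1
    # Broadcasting scales every H x W map by its own weight before pooling.
    attended = (feature_maps * weights).mean(axis=(0, 1))
    return attended, weights


def fuse_regions(region_maps, params):
    """Apply attention per region, then concatenate (assumed fusion operator)."""
    vectors = [channel_attention(region_maps[r], *params[r])[0] for r in REGIONS]
    return np.concatenate(vectors)


# Random placeholders for the per-region ConvNet features (assumed shapes).
rng = np.random.default_rng(0)
H, W, C = 4, 4, 8
regions = {r: rng.standard_normal((H, W, C)) for r in REGIONS}
params = {r: (0.1 * rng.standard_normal((C, C)), np.zeros(C)) for r in REGIONS}

fused = fuse_regions(regions, params)
print(fused.shape)  # (24,)
```

In a full pipeline the fused vector would be the input to the regularized dense classifier; with real Xception features the channel count would be much larger than the 8 used here.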

Author information

Corresponding author

Correspondence to Nagur Shareef Shaik.

About this article

Cite this article

Shaik, N.S., Cherukuri, T.K. Visual attention based composite dense neural network for facial expression recognition. J Ambient Intell Human Comput 14, 16229–16242 (2023). https://doi.org/10.1007/s12652-022-03843-8

  • DOI: https://doi.org/10.1007/s12652-022-03843-8
