
Robust affect analysis using committee of deep convolutional neural networks

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Human emotion recognition has attracted researchers' attention because of its potential applications: identifying consumers' mood and interest in a product, assessing learners' emotional states, building smart cars for the automotive industry, and detecting a person's mental state in health-care settings. In this paper, a committee network that focuses on the applicability of deep features for recognizing human emotion from facial expressions is proposed. The architecture benefits from multi-level feature extraction using multiple filter sizes, which improves network performance. The designed variant of the inception–residual structure routes the input through multiple paths, so emotion variations are explicitly captured by multi-path sibling layers and concatenated for recognition. The proposed algorithm is evaluated on the eNTERFACE, SAVEE and AFEW databases, obtaining accuracies of 94.76%, 98.67% and 66.84%, respectively.
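
As a concrete illustration of the described structure, the sketch below shows an inception–residual block whose multi-path sibling convolutions are concatenated and added back to the input, together with a simple committee decision rule that averages the member networks' outputs. This is a minimal PyTorch sketch of the general idea; the kernel sizes, channel counts and averaging-based fusion are illustrative assumptions, not the authors' published configuration.

    # Minimal sketch (assumed configuration, not the paper's exact network).
    import torch
    import torch.nn as nn

    class InceptionResidualBlock(nn.Module):
        def __init__(self, in_channels: int, path_channels: int = 32):
            super().__init__()
            # Sibling paths with different receptive fields (1x1, 3x3, 5x5),
            # so features are extracted at multiple levels in parallel.
            self.path1 = nn.Sequential(
                nn.Conv2d(in_channels, path_channels, kernel_size=1),
                nn.ReLU(inplace=True))
            self.path2 = nn.Sequential(
                nn.Conv2d(in_channels, path_channels, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(path_channels, path_channels, kernel_size=3, padding=1),
                nn.ReLU(inplace=True))
            self.path3 = nn.Sequential(
                nn.Conv2d(in_channels, path_channels, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(path_channels, path_channels, kernel_size=5, padding=2),
                nn.ReLU(inplace=True))
            # 1x1 projection so the concatenated paths match the residual input.
            self.project = nn.Conv2d(3 * path_channels, in_channels, kernel_size=1)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Concatenate sibling-path features along the channel axis,
            # project, then add the identity (residual) connection.
            out = torch.cat([self.path1(x), self.path2(x), self.path3(x)], dim=1)
            return torch.relu(x + self.project(out))

    def committee_predict(members, x):
        # Committee decision: average the members' class probabilities
        # and pick the most likely emotion (simple soft voting).
        probs = [torch.softmax(m(x), dim=1) for m in members]
        return torch.stack(probs).mean(dim=0).argmax(dim=1)

A committee would instantiate several such networks, train them independently, and fuse their predictions with committee_predict; the soft-voting rule shown here is one common fusion choice among several plausible schemes.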

Author information

Corresponding author

Correspondence to Newlin Shebiah Russel.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Russel, N.S., Selvaraj, A. Robust affect analysis using committee of deep convolutional neural networks. Neural Comput & Applic 34, 3633–3645 (2022). https://doi.org/10.1007/s00521-021-06632-0

