Abstract
Facial Emotion Recognition (FER) has gained significant attention in recent years due to its potential applications in various fields such as automotive, mental health, and education. Despite the impressive results of Deep Learning (DL) in these areas, a critical shortfall of these systems is the lack of explainability. This paper presents a systematic analysis using Gradient-weighted Class Activation Mapping (Grad-CAM) combined with Guided Backpropagation to investigate the features learned by DL models in FER and their alignment with established emotional theories. We apply this methodology to Convolutional Neural Networks (CNNs) trained on three different emotional datasets: FER-2013, RAF-DB, and AffectNet. Our findings indicate that machine-learned features vary within an emotional state and are not necessarily aligned with expert understanding in emotion psychology. This raises questions about the reliability and ethical implications of using FER systems in sensitive areas, where accurate interpretation of emotions is critical. In response, our study proposes exploring Neuro-Symbolic AI approaches as a potential pathway to more effectively grasp the complexity of emotion psychology and address these concerns. This approach paves the way for the development of new FER model architectures, potentially fostering the emergence of more nuanced emotional concepts.
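To make the analysis method concrete, the core Grad-CAM computation the paper relies on can be sketched as follows. This is an illustrative numpy sketch, not the authors' implementation: given a convolutional layer's activations and the gradients of a class score with respect to them, the channel weights are obtained by global average pooling of the gradients, and the heatmap is the ReLU of the weighted sum of feature maps (in Guided Grad-CAM, this map is then multiplied element-wise with the guided-backpropagation saliency).

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap for one class.

    activations: (K, H, W) feature maps A^k of the chosen conv layer
    gradients:   (K, H, W) gradients dY_c/dA^k of the class score
    returns:     (H, W) heatmap, non-negative, scaled to [0, 1]
    """
    # alpha_k: global-average-pool the gradients over spatial dims
    weights = gradients.mean(axis=(1, 2))             # shape (K,)
    # weighted combination of the feature maps, then ReLU
    cam = np.tensordot(weights, activations, axes=1)  # shape (H, W)
    cam = np.maximum(cam, 0.0)
    # normalize for visualization (guard against an all-zero map)
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

In practice the activations and gradients would come from forward/backward hooks on the last convolutional layer of the CNN, and the resulting map is upsampled to the input resolution before overlaying it on the face image.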
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Gebele, J., Brune, P., Schwab, F., von Mammen, S. (2025). Interpreting Emotions Through the Grad-CAM Lens: Insights and Implications in CNN-Based Facial Emotion Recognition. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15313. Springer, Cham. https://doi.org/10.1007/978-3-031-78201-5_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78200-8
Online ISBN: 978-3-031-78201-5