Interpreting Emotions Through the Grad-CAM Lens: Insights and Implications in CNN-Based Facial Emotion Recognition

  • Conference paper
Pattern Recognition (ICPR 2024)

Abstract

Facial Emotion Recognition (FER) has gained significant attention in recent years due to its potential applications in fields such as automotive, mental health, and education. Despite the impressive results of Deep Learning (DL) in these areas, a critical shortfall of such systems is their lack of explainability. This paper presents a systematic analysis using Gradient-weighted Class Activation Mapping (Grad-CAM) combined with Guided Backpropagation to investigate the features learned by DL models in FER and their alignment with established theories of emotion. We apply this methodology to Convolutional Neural Networks (CNNs) trained on three emotion datasets: FER-2013, RAF-DB, and AffectNet. Our findings indicate that machine-learned features vary within a single emotional state and are not necessarily aligned with expert understanding in emotion psychology. This raises questions about the reliability of FER systems, and about the ethical implications of deploying them in sensitive areas where accurate interpretation of emotions is critical. In response, we propose exploring Neuro-Symbolic AI approaches as a potential pathway to capture the complexity of emotion psychology more effectively and to address these concerns. Such approaches pave the way for new FER model architectures, potentially fostering the emergence of more nuanced emotional concepts.
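The pipeline the abstract describes, Grad-CAM combined with Guided Backpropagation (often called Guided Grad-CAM, after Selvaraju et al.), can be sketched in a few lines of PyTorch. The sketch below is illustrative only: the VGG-16 backbone, target layer, random input tensor, and class index are stand-in assumptions, since the paper trains its own CNNs on FER-2013, RAF-DB, and AffectNet.

```python
# Hedged sketch of Guided Grad-CAM (Selvaraju et al.) for a FER-style CNN.
# All model/layer names below are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F
from torchvision import models


def grad_cam(model, target_layer, image, class_idx):
    """Grad-CAM: L^c = ReLU(sum_k alpha_k^c A^k), with alpha_k^c the
    spatially averaged gradient of the class score w.r.t. feature map A^k."""
    store = {}
    h1 = target_layer.register_forward_hook(
        lambda mod, inp, out: store.update(act=out))
    h2 = target_layer.register_full_backward_hook(
        lambda mod, gin, gout: store.update(grad=gout[0]))
    model.zero_grad()
    model(image)[0, class_idx].backward()
    h1.remove(); h2.remove()

    alpha = store["grad"].mean(dim=(2, 3), keepdim=True)  # pooled gradients
    cam = F.relu((alpha * store["act"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)
    return cam / (cam.max() + 1e-8)                       # normalise to [0, 1]


def guided_backprop(model, image, class_idx):
    """Guided Backpropagation: gradient of the class score w.r.t. the input,
    with negative gradients zeroed at every ReLU on the backward pass."""
    handles = [m.register_full_backward_hook(
                   lambda mod, gin, gout: (F.relu(gin[0]),))
               for m in model.modules() if isinstance(m, torch.nn.ReLU)]
    image = image.clone().requires_grad_(True)
    model.zero_grad()
    model(image)[0, class_idx].backward()
    for h in handles:
        h.remove()
    return image.grad


# Usage with stand-in names (the paper trains its own CNNs rather than
# using an ImageNet VGG-16, and its emotion classes replace class_idx=0):
model = models.vgg16(weights="IMAGENET1K_V1").eval()
for m in model.modules():                  # in-place ReLUs break backward hooks
    if isinstance(m, torch.nn.ReLU):
        m.inplace = False
x = torch.randn(1, 3, 224, 224)            # placeholder for a face crop
cam = grad_cam(model, model.features[28], x, class_idx=0)  # last conv layer
gb = guided_backprop(model, x, class_idx=0)
guided_grad_cam = gb * cam                 # (1,3,H,W) * (1,1,H,W) broadcast
```

The elementwise product is what makes the visualisation both class-discriminative (from the Grad-CAM term) and high-resolution (from the Guided Backpropagation term); inspecting such maps per emotion class across the three datasets is the kind of analysis the paper carries out.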



Author information

Correspondence to Jens Gebele.


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gebele, J., Brune, P., Schwab, F., von Mammen, S. (2025). Interpreting Emotions Through the Grad-CAM Lens: Insights and Implications in CNN-Based Facial Emotion Recognition. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15313. Springer, Cham. https://doi.org/10.1007/978-3-031-78201-5_27

  • DOI: https://doi.org/10.1007/978-3-031-78201-5_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78200-8

  • Online ISBN: 978-3-031-78201-5

  • eBook Packages: Computer Science, Computer Science (R0)
