
Human-Computer Interaction Approach with Empathic Conversational Agent and Computer Vision

  • Conference paper
  • In: Artificial Intelligence for Neuroscience and Emotional Systems (IWINAC 2024)

Abstract

The integration of empathy into Human-Computer Interaction (HCI) is essential for enhancing user experiences. Current HCI systems often overlook users’ emotional states, limiting interaction quality. This research examines the integration of Multimodal Emotion Recognition (MER) into empathic generative conversational agents, encompassing facial, body, and speech emotion recognition along with sentiment analysis. These signals are fused and supplied to Large Language Models (LLMs) so that the agent can continuously comprehend and respond to users empathically. The paper highlights the advantages of this multimodal approach over traditional unimodal systems in recognizing complex human emotions, and it provides a well-structured background on the topics addressed. The findings include an overview of deep learning in HCI, a review of methods used for emotion recognition and conversational agents, and the proposal of an HCI architecture that integrates facial, body, and speech emotion recognition and sentiment analysis into a fusion model whose output conditions an LLM, yielding an empathic conversational agent. This research contributes to the field of HCI by providing an architecture to guide the development of more realistic and meaningful interactions through MER and a conversational agent.
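The architecture the abstract describes can be illustrated with a minimal late-fusion sketch: per-modality emotion scores (face, body, speech, and text sentiment) are combined into a single distribution, and the dominant fused emotion is injected into the prompt that conditions the LLM. All function names, emotion labels, and weights below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the proposed pipeline: late fusion of per-modality
# emotion probabilities, then conditioning an LLM prompt on the result.

EMOTIONS = ["anger", "happiness", "neutral", "sadness"]  # assumed label set

def fuse(modality_scores, weights):
    """Weighted-average (late) fusion of per-modality probability vectors."""
    fused = [0.0] * len(EMOTIONS)
    for name, scores in modality_scores.items():
        for i, p in enumerate(scores):
            fused[i] += weights[name] * p
    total = sum(fused)
    return [p / total for p in fused]  # renormalize to a distribution

def empathic_prompt(fused):
    """Build a system prompt that conditions a (hypothetical) LLM call."""
    dominant = EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]
    return (f"The user currently appears to feel {dominant}. "
            "Respond empathically, acknowledging this emotional state.")

# Illustrative per-modality outputs (probabilities over EMOTIONS):
scores = {
    "face":   [0.1, 0.6, 0.2, 0.1],
    "body":   [0.2, 0.5, 0.2, 0.1],
    "speech": [0.1, 0.7, 0.1, 0.1],
    "text":   [0.0, 0.8, 0.1, 0.1],
}
weights = {"face": 0.3, "body": 0.2, "speech": 0.3, "text": 0.2}

fused = fuse(scores, weights)
print(empathic_prompt(fused))  # dominant fused emotion here is "happiness"
```

In practice each modality would be produced by its own deep model and the fusion step could itself be learned; the fixed weighted average above simply shows where the multimodal signal enters the conversational loop.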




Acknowledgments

This work was supported by national funds through the Portuguese Foundation for Science and Technology (FCT), I.P., under the project UIDB/04524/2020, and was partially supported by Portuguese national funds through FITEC - Programa Interface, with reference CIT “INOV-INESC Inovação - Financiamento Base”.

Author information


Corresponding author

Correspondence to António Pereira.


Ethics declarations

Disclosure of Interests

The authors have no competing interests.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Pereira, R., Mendes, C., Costa, N., Frazão, L., Fernández-Caballero, A., Pereira, A. (2024). Human-Computer Interaction Approach with Empathic Conversational Agent and Computer Vision. In: Ferrández Vicente, J.M., Val Calvo, M., Adeli, H. (eds) Artificial Intelligence for Neuroscience and Emotional Systems. IWINAC 2024. Lecture Notes in Computer Science, vol 14674. Springer, Cham. https://doi.org/10.1007/978-3-031-61140-7_41


  • DOI: https://doi.org/10.1007/978-3-031-61140-7_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-61139-1

  • Online ISBN: 978-3-031-61140-7

  • eBook Packages: Computer Science, Computer Science (R0)
