Abstract
Integrating empathy into Human-Computer Interaction (HCI) is essential for enhancing user experiences, yet current HCI systems often overlook users’ emotional states, limiting interaction quality. This research examines the integration of Multimodal Emotion Recognition (MER) into empathic, generative conversational agents, combining facial, body, and speech emotion recognition with sentiment analysis. The outputs of these recognizers are fused and supplied to Large Language Models (LLMs) so that the agent can continuously understand and respond to users empathically. The paper highlights the advantages of this multimodal approach over traditional unimodal systems in recognizing complex human emotions, and it provides a structured background on the topics involved. Its contributions include an overview of deep learning in HCI, a review of methods for emotion recognition and conversational agents, and the proposal of an HCI architecture that fuses facial, body, and speech emotion recognition with sentiment analysis and feeds the fused result to an LLM, yielding an empathic conversational agent. This research thereby contributes to the field of HCI an architecture to guide the development of more realistic and meaningful interactions through MER and a conversational agent.
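As a rough illustration of the pipeline the abstract describes, the sketch below shows one way fused multimodal emotion estimates could condition an LLM-based agent. It is a minimal late-fusion sketch in Python under assumed conventions; all names here (ModalityOutput, late_fusion, empathic_prompt, the emotion set, and the per-modality weights) are hypothetical, since no implementation appears in this page.

```python
# Hypothetical sketch of fusing per-modality emotion scores and
# conditioning an LLM prompt on the result. Names and the emotion
# taxonomy are illustrative assumptions, not the authors' code.
from dataclasses import dataclass

EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "neutral", "sadness", "surprise"]

@dataclass
class ModalityOutput:
    name: str                 # e.g. "face", "body", "speech", "text"
    scores: dict[str, float]  # emotion -> probability from that recognizer
    weight: float             # assumed reliability of this modality

def late_fusion(outputs: list[ModalityOutput]) -> dict[str, float]:
    """Weighted average of per-modality emotion distributions (late fusion)."""
    total = sum(o.weight for o in outputs) or 1.0
    fused = {e: 0.0 for e in EMOTIONS}
    for o in outputs:
        for e in EMOTIONS:
            fused[e] += o.weight * o.scores.get(e, 0.0) / total
    return fused

def empathic_prompt(user_utterance: str, fused: dict[str, float]) -> str:
    """Condition the LLM on the fused emotional state via its prompt."""
    dominant = max(fused, key=fused.get)
    return (f"The user appears to feel {dominant} "
            f"(confidence {fused[dominant]:.2f}). "
            f"Reply empathically to: {user_utterance}")
```

In the proposed architecture, such fused estimates would be refreshed continuously during the conversation so that the agent's responses track the user's emotional state over time.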
Acknowledgments
This work was supported by national funds through the Portuguese Foundation for Science and Technology (FCT), I.P., under project UIDB/04524/2020, and was partially supported by Portuguese national funds through FITEC - Programa Interface, with reference CIT “INOV - INESC Inovação - Financiamento Base”.
Ethics declarations
Disclosure of Interests
The authors have no competing interests.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pereira, R., Mendes, C., Costa, N., Frazão, L., Fernández-Caballero, A., Pereira, A. (2024). Human-Computer Interaction Approach with Empathic Conversational Agent and Computer Vision. In: Ferrández Vicente, J.M., Val Calvo, M., Adeli, H. (eds) Artificial Intelligence for Neuroscience and Emotional Systems. IWINAC 2024. Lecture Notes in Computer Science, vol 14674. Springer, Cham. https://doi.org/10.1007/978-3-031-61140-7_41
DOI: https://doi.org/10.1007/978-3-031-61140-7_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-61139-1
Online ISBN: 978-3-031-61140-7
eBook Packages: Computer Science, Computer Science (R0)