
The eHRI database: a multimodal database of engagement in human–robot interactions

  • Original Paper
Language Resources and Evaluation

Abstract

We present the engagement in human–robot interaction (eHRI) database, which contains natural interactions between two human participants and a robot in a story-shaping game scenario. The audio-visual recordings provided with the database are fully annotated on a 5-level intensity scale for head nods and smiles, together with speech transcriptions and continuous engagement values. In addition, we present baseline results for smile and head nod detection, along with a real-time multimodal engagement monitoring system. We believe the eHRI database will serve as a novel asset for research in affective human–robot interaction by providing raw data, annotations, and baseline results.
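As a rough illustration of how per-frame annotations of this kind (discrete head-nod and smile intensities plus a continuous engagement value) could be consumed, the sketch below assumes a simple CSV layout with columns time_s, head_nod, smile, and engagement. The file format, column names, and helper functions are hypothetical and are not taken from the eHRI release.

# Illustrative sketch only: the actual eHRI file layout is not described here,
# so the CSV format and all field/column names below are assumptions.
import csv
from dataclasses import dataclass
from typing import List


@dataclass
class FrameAnnotation:
    """One annotated frame for a single participant (hypothetical schema)."""
    time_s: float      # timestamp in seconds
    head_nod: int      # head-nod intensity on a 5-level scale (0 = none, 4 = strongest)
    smile: int         # smile intensity on the same 5-level scale
    engagement: float  # continuous engagement value


def load_annotations(path: str) -> List[FrameAnnotation]:
    """Read a hypothetical per-frame annotation CSV into typed records."""
    records = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            records.append(FrameAnnotation(
                time_s=float(row["time_s"]),
                head_nod=int(row["head_nod"]),
                smile=int(row["smile"]),
                engagement=float(row["engagement"]),
            ))
    return records


def mean_engagement_during(records: List[FrameAnnotation], min_smile: int = 3) -> float:
    """Toy analysis: average engagement over frames with a strong smile."""
    selected = [r.engagement for r in records if r.smile >= min_smile]
    return sum(selected) / len(selected) if selected else float("nan")

A loader like this would make it straightforward to align the discrete gesture annotations with the continuous engagement signal for baseline experiments such as the smile and head nod detection reported in the paper.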


Notes

  1. https://www.softbankrobotics.com/emea/en/nao.

  2. https://www.softbankrobotics.com/emea/en/pepper.

  3. https://furhatrobotics.com/.

  4. The eHRI database will be publicly available at https://mvgl.ku.edu.tr/databases/.


Funding

This work is supported by Türkiye Bilimsel ve Teknolojik Araştırma Kurumu (TÜBİTAK, the Scientific and Technological Research Council of Türkiye) under Grant Number 217E040.

Author information


Corresponding author

Correspondence to Engin Erzin.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kesim, E., Numanoglu, T., Bayramoglu, O. et al. The eHRI database: a multimodal database of engagement in human–robot interactions. Lang Resources & Evaluation 57, 985–1009 (2023). https://doi.org/10.1007/s10579-022-09632-1

