Abstract
Engagement represents how much a user is interested in and willing to continue the current dialogue, and it is an important cue for spoken dialogue systems to adapt to the user's state. We address engagement recognition based on the listener's multimodal behaviors, such as backchannels, laughing, head nodding, and eye gaze. When ground-truth labels are given by multiple annotators, the labels often differ among annotators because of their different perspectives on the multimodal behaviors. We assume that each annotator has a latent character that affects his or her perception of engagement. We propose a hierarchical Bayesian model that estimates both the engagement level and the character of each annotator as latent variables. Furthermore, we incorporate additional latent variables that map the input features into a sub-space. Experimental results show that the proposed model achieves higher accuracy than models that do not take the annotator characters into account.
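The core idea of estimating annotator-specific perception jointly with the true label can be illustrated with a much simpler stand-in. The sketch below is NOT the paper's hierarchical Bayesian model (which also maps input features into a latent sub-space); it is a Dawid-Skene-style EM estimator in which each annotator's "character" is reduced to a confusion matrix over binary engagement labels. All names here are illustrative.

```python
import numpy as np

def em_annotator_model(labels, n_iter=50):
    """labels: (n_items, n_annotators) array of 0/1 engagement annotations.

    Returns per-item posteriors P(engaged) and per-annotator confusion
    matrices conf[a, true_label, observed_label].
    """
    n_items, n_annot = labels.shape
    q = labels.mean(axis=1)  # initialize P(engaged | item) with the vote share
    conf = np.zeros((n_annot, 2, 2))
    for _ in range(n_iter):
        # M-step: re-estimate each annotator's confusion matrix from q.
        for a in range(n_annot):
            for obs in (0, 1):
                mask = (labels[:, a] == obs).astype(float)
                conf[a, 1, obs] = (q * mask).sum()
                conf[a, 0, obs] = ((1.0 - q) * mask).sum()
        conf = np.clip(conf, 1e-6, None)
        conf /= conf.sum(axis=2, keepdims=True)  # normalize each true-label row
        prior = np.clip(q.mean(), 1e-6, 1.0 - 1e-6)  # class prior P(engaged)
        # E-step: recompute label posteriors given the annotator characters.
        log_p1 = np.log(prior) + np.log(conf[np.arange(n_annot), 1, labels]).sum(axis=1)
        log_p0 = np.log(1.0 - prior) + np.log(conf[np.arange(n_annot), 0, labels]).sum(axis=1)
        q = 1.0 / (1.0 + np.exp(log_p0 - log_p1))
    return q, conf
```

Even this stripped-down version captures the paper's motivation: a consistently "strict" or "lenient" annotator is absorbed into their confusion matrix rather than distorting the estimated engagement labels. The full model additionally places the character and the feature sub-space mapping in one hierarchical Bayesian formulation.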
Acknowledgements
This work was supported by JSPS KAKENHI (Grant Number 15J07337) and JST ERATO Ishiguro Symbiotic Human-Robot Interaction program (Grant Number JPMJER1401), Japan.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Inoue, K., Lala, D., Takanashi, K., Kawahara, T. (2019). Latent Character Model for Engagement Recognition Based on Multimodal Behaviors. In: D'Haro, L., Banchs, R., Li, H. (eds) 9th International Workshop on Spoken Dialogue System Technology. Lecture Notes in Electrical Engineering, vol 579. Springer, Singapore. https://doi.org/10.1007/978-981-13-9443-0_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9442-3
Online ISBN: 978-981-13-9443-0