Multimodal and Multi-party Social Interactions

Chapter in: Context Aware Human-Robot and Human-Agent Interaction

Part of the book series: Human–Computer Interaction Series (HCIS)

Abstract

Virtual characters and robots that interact with people in social contexts should understand users' behaviour and respond with gestures, facial expressions, and gaze. The challenges in this area are estimating high-level user states by fusing low-level multi-modal sensory input, making socially appropriate decisions from this partial sensory information, and rendering synchronized, timely multi-modal behaviours based on those decisions. Moreover, these characters should be able to communicate with multiple users, and with each other, in multi-party group interactions. In this chapter, we provide an overview of methods for multi-modal and multi-party interaction and discuss the open challenges in this area. We also describe our current work and point out future research directions.
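To make the perception-decision-rendering pipeline described in the abstract concrete, the sketch below outlines one plausible loop for such an agent: low-level multi-modal cues are fused into a high-level user state, a socially appropriate action is chosen from that partial information, and the action is handed to a behaviour renderer. This is purely illustrative and not the chapter's implementation: the type names, heuristics, and action labels (`Observation`, `fuse_user_state`, `"backchannel_nod"`, and so on) are hypothetical placeholders, and real systems fuse noisy cues probabilistically over time rather than with fixed rules.

```python
# Illustrative sketch only: a minimal perceive-fuse-decide-render loop for a
# multi-modal social agent. All names and the rule-based policy below are
# hypothetical; the chapter surveys far richer fusion and decision methods.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    speech: Optional[str]   # partial ASR hypothesis; may be missing
    gaze_at_agent: bool     # low-level gaze estimate
    expression: str         # e.g. "neutral", "smile"


@dataclass
class UserState:
    engaged: bool           # high-level state fused from low-level cues
    wants_turn: bool


def fuse_user_state(obs: Observation) -> UserState:
    # Placeholder fusion: simple heuristics stand in for probabilistic
    # integration of noisy cues over time.
    engaged = obs.gaze_at_agent or obs.speech is not None
    wants_turn = obs.speech is not None and obs.speech.endswith("?")
    return UserState(engaged=engaged, wants_turn=wants_turn)


def decide(state: UserState) -> str:
    # Choose a socially appropriate action despite partial information.
    if not state.engaged:
        return "idle_gaze"            # scan the room, invite engagement
    if state.wants_turn:
        return "answer_with_gesture"  # take the turn, gesture while speaking
    return "backchannel_nod"          # signal active listening


def render(action: str) -> None:
    # Stand-in for synchronized multi-modal rendering (speech, gesture, gaze).
    print(f"rendering behaviour: {action}")


if __name__ == "__main__":
    obs = Observation(speech="Where is the exit?", gaze_at_agent=True,
                      expression="neutral")
    render(decide(fuse_user_state(obs)))
```

In a multi-party setting, one would run the fusion step per user and add an addressee-selection step before rendering; the single-user loop above only shows the basic structure.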


Author information

Correspondence to Zerrin Yumak.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Yumak, Z., Magnenat-Thalmann, N. (2016). Multimodal and Multi-party Social Interactions. In: Magnenat-Thalmann, N., Yuan, J., Thalmann, D., You, BJ. (eds) Context Aware Human-Robot and Human-Agent Interaction. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-19947-4_13

  • DOI: https://doi.org/10.1007/978-3-319-19947-4_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19946-7

  • Online ISBN: 978-3-319-19947-4

  • eBook Packages: Computer Science, Computer Science (R0)
