Research Article

Conversational Agent Learning Natural Gaze and Motion of Multi-Party Conversation from Example

Published: 27 October 2017

ABSTRACT

Recent developments in robotics and virtual reality (VR) are making embodied agents increasingly familiar, and the social behaviors of embodied conversational agents are essential for making daily life with such agents comfortable. In particular, natural nonverbal behaviors such as gaze and gesture are required. We propose a novel method for creating an agent with human-like gaze as a listener in multi-party conversation, using a Hidden Markov Model (HMM) to learn the behavior from examples of real conversations. The model generates gaze reactions according to the users' gaze and utterances. We implemented an agent with the proposed method and created a VR environment in which to interact with it. The proposed agent reproduced several features of the gaze behavior found in the example conversations, and the results of an impression survey showed that at least one group of participants felt the proposed agent was human-like and better than agents driven by conventional methods.
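The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: a discrete HMM whose observation symbols encode the users' gaze targets and utterance state, and whose hidden states are mapped to the agent's own gaze targets. All names, alphabet sizes, and the random stand-in parameters below are assumptions for illustration; in the paper the HMM parameters would be learned from example conversations (e.g. via Baum-Welch), not drawn at random.

import numpy as np

# Hypothetical discrete alphabets (illustrative, not from the paper):
# observations combine who the current speaker is looking at with a
# speaking/silent flag; hidden states stand for the agent's gaze targets.
N_HIDDEN = 4   # e.g. gaze at speaker, at addressee, at object, averted
N_OBS = 6      # e.g. 3 user gaze targets x {speaking, silent}

rng = np.random.default_rng(0)

# Stand-ins for parameters that would be learned from real conversations:
A = rng.dirichlet(np.ones(N_HIDDEN), size=N_HIDDEN)  # state transition matrix
B = rng.dirichlet(np.ones(N_OBS), size=N_HIDDEN)     # emission probabilities
pi = np.full(N_HIDDEN, 1.0 / N_HIDDEN)               # initial state distribution

def filter_gaze(obs_stream):
    """Forward-filter the hidden state online: for each incoming
    observation symbol, return the most probable agent gaze target."""
    belief = pi * B[:, obs_stream[0]]
    belief /= belief.sum()
    targets = [int(belief.argmax())]
    for o in obs_stream[1:]:
        belief = (A.T @ belief) * B[:, o]  # predict, then correct
        belief /= belief.sum()
        targets.append(int(belief.argmax()))
    return targets

# Toy observation stream: user gaze/utterance codes over time.
print(filter_gaze([0, 1, 1, 5, 3, 2]))

Forward filtering keeps the per-frame update cheap (quadratic in the number of hidden states), so a gaze target can be selected online while the conversation unfolds rather than after the fact.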


Supplemental Material

haip1037-file3.mp4 (MP4, 3.2 MB)


Published in

HAI '17: Proceedings of the 5th International Conference on Human Agent Interaction
October 2017
550 pages
ISBN: 9781450351133
DOI: 10.1145/3125739

Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States




Acceptance Rates

Overall Acceptance Rate: 121 of 404 submissions, 30%
Article Metrics

Downloads (Last 12 months): 18
Downloads (Last 6 weeks): 2
