
Multi-party, multi-role comprehensive listening behavior


Abstract

Realizing effective listening behavior in virtual humans has become a key area of research, especially as work has sought to realize more complex social scenarios involving multiple participants and bystanders. A human listener’s nonverbal behavior is conditioned by a variety of factors, from the current speaker’s behavior to the listener’s role, desire to participate in the conversation, and unfolding comprehension of the speaker. Similarly, we seek to create virtual humans able to provide feedback based on their participatory goals and on their unfolding understanding of, and reaction to, the relevance of what the speaker is saying as the speaker speaks. Drawing on a survey of the psychological literature as well as recent technological advances in the recognition and partial understanding of natural language, we describe a model that integrates these factors into a virtual human that behaves consistently with these goals. We then discuss how the model is implemented in a virtual human architecture and present an evaluation of the behaviors used in the model.
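
The abstract outlines an architecture in which a listener's feedback is driven by conversational role, participatory goals, and incrementally unfolding comprehension of the speaker. As a minimal illustrative sketch, not the authors' implementation (which is described in the full paper), the Python fragment below shows one way such factors might be combined into a feedback decision; every name, threshold, and behavior label here is hypothetical.

```python
# Illustrative sketch only: all names, thresholds, and behavior labels are
# hypothetical, chosen to mirror the factors the abstract enumerates.
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ADDRESSEE = "addressee"        # participant the speaker addresses
    SIDE_PARTICIPANT = "side"      # ratified participant, not addressed
    BYSTANDER = "bystander"        # present but not a ratified participant


@dataclass
class ListenerState:
    role: Role            # conversational role of this listener
    wants_turn: bool      # participatory goal: does the listener want to speak?
    comprehension: float  # unfolding understanding of the utterance, 0.0-1.0
    relevance: float      # appraised relevance of what is being said, 0.0-1.0


def choose_feedback(state: ListenerState) -> str:
    """Map a listener's current state to a nonverbal feedback behavior."""
    # Bystanders typically give little overt feedback beyond attention.
    if state.role is Role.BYSTANDER:
        return "gaze_at_speaker"
    # Low comprehension can be signaled as trouble, e.g. a frown.
    if state.comprehension < 0.3:
        return "frown"
    # Understood, relevant content invites an affirmative backchannel.
    if state.relevance > 0.7:
        return "head_nod"
    # A listener seeking the floor may lean in to signal the desire to speak.
    if state.wants_turn:
        return "lean_forward"
    return "gaze_at_speaker"


# Example: a side participant following relevant speech nods along.
print(choose_feedback(ListenerState(Role.SIDE_PARTICIPANT, False, 0.8, 0.9)))
```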


Notes

  1. The forced choice obviously simplifies this decoding task for the observer, but the use of gibberish makes it harder.


Acknowledgments

This work was sponsored by the U.S. Army Research, Development, and Engineering Command (RDECOM). The content does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred. We would also like to thank our colleagues Drs. David DeVault, L.-P. Morency, and David Traum for all their help in implementing this model.

Author information


Correspondence to Zhiyang Wang.


About this article

Cite this article

Wang, Z., Lee, J. & Marsella, S. Multi-party, multi-role comprehensive listening behavior. Auton Agent Multi-Agent Syst 27, 218–234 (2013). https://doi.org/10.1007/s10458-012-9215-8
