Abstract
In contrast to the variety of listening behaviors produced in human-to-human interaction, most virtual agents sit or stand passively while a user speaks. This passivity reflects a technical limitation: although the appropriate responsive behavior of a listener often depends on the semantics of what is being said, current speech understanding technology does not make semantic information available until after an utterance is complete. This paper illustrates that appropriate listening behavior can also be generated from other features of a speaker’s behavior that are available in real time, such as speech quality, posture shifts, and head movements. It presents a mapping from these features of a human speaker to agent listening behaviors.
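As a rough illustration of what a mapping from real-time speaker features to listener behaviors could look like in code, consider the minimal Python sketch below. The feature names (`pitch_lowered`, `loudness_raised`, `posture_shift`, `head_nod`), the behavior labels, and the rules themselves are hypothetical placeholders chosen for illustration; they are not the mapping defined in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeakerFrame:
    """One real-time observation of the human speaker.

    All fields are hypothetical stand-ins for detectors of speech
    quality, posture shifts, and head movements.
    """
    pitch_lowered: bool = False    # speech-quality cue (assumed)
    loudness_raised: bool = False  # speech-quality cue (assumed)
    posture_shift: bool = False    # body cue (assumed)
    head_nod: bool = False         # head-movement cue (assumed)


def listener_behaviors(frame: SpeakerFrame) -> List[str]:
    """Map one frame of speaker features to agent listening behaviors.

    The rules are illustrative only: speech cues trigger a nod, and
    body or head cues are mirrored by the agent.
    """
    behaviors: List[str] = []
    if frame.pitch_lowered or frame.loudness_raised:
        behaviors.append("nod")
    if frame.posture_shift:
        behaviors.append("mirror_posture_shift")
    if frame.head_nod and "nod" not in behaviors:
        behaviors.append("nod")
    return behaviors


if __name__ == "__main__":
    frame = SpeakerFrame(pitch_lowered=True, posture_shift=True)
    print(listener_behaviors(frame))
```

Because every rule reads only features that can be observed while the speaker is still talking, a function of this shape can run on each incoming frame without waiting for the utterance to finish.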
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maatman, R.M., Gratch, J., Marsella, S. (2005). Natural Behavior of a Listening Agent. In: Panayiotopoulos, T., Gratch, J., Aylett, R., Ballin, D., Olivier, P., Rist, T. (eds) Intelligent Virtual Agents. IVA 2005. Lecture Notes in Computer Science, vol 3661. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550617_3
DOI: https://doi.org/10.1007/11550617_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28738-4
Online ISBN: 978-3-540-28739-1
eBook Packages: Computer Science (R0)