
Recognition of Affective Communicative Intent in Robot-Directed Speech


Abstract

Human speech provides a natural and intuitive interface both for communicating with humanoid robots and for teaching them. In general, the acoustic pattern of speech carries three kinds of information: who the speaker is, what the speaker said, and how the speaker said it. This paper focuses on recognizing affective communicative intent in robot-directed speech without analyzing linguistic content. We present an approach for recognizing four distinct prosodic patterns that communicate praise, prohibition, attention, and comfort to preverbal infants. These communicative intents are well matched to teaching a robot, since praise, prohibition, and directing the robot's attention to relevant aspects of a task can all be used by a human instructor to intuitively facilitate the robot's learning process. We integrate this perceptual ability into our robot's “emotion” system, thereby allowing a human to directly manipulate the robot's affective state. This has a powerful organizing influence on the robot's behavior and will ultimately be used to socially communicate affective reinforcement. Communicative efficacy has been tested both with people who are very familiar with the robot and with naïve subjects.
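The approach described above classifies intent from prosody alone, that is, from how an utterance sounds rather than what it says. The sketch below is a hypothetical illustration of that idea, not the paper's implementation: it summarizes each utterance's pitch (F0) and energy contours into a vector of global statistics, fits one Gaussian mixture model per intent class, and labels a new utterance with the best-scoring class. The prosodic profiles, the feature set, and the use of scikit-learn's GaussianMixture are all assumptions made for this example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

CLASSES = ["praise", "prohibition", "attention", "comfort"]

# Illustrative prosodic profiles (F0 mean in Hz, F0 spread in Hz, energy in dB),
# loosely following the infant-directed-speech literature; the numbers are invented.
PROFILES = {
    "praise":      (320.0, 80.0, 62.0),  # high, exaggerated pitch contours
    "prohibition": (180.0, 20.0, 66.0),  # low, clipped pitch; loud
    "attention":   (300.0, 50.0, 68.0),  # high pitch, loud bursts
    "comfort":     (200.0, 15.0, 55.0),  # low, smooth, soft
}

def fake_utterance(rng, f0_mean, f0_spread, energy_mean, n=200):
    """Fabricate a pitch/energy contour pair (demonstration data only)."""
    f0 = rng.normal(f0_mean, f0_spread, size=n)
    energy = rng.normal(energy_mean, 5.0, size=n)
    return f0, energy

def prosodic_features(f0, energy):
    """Summarize an utterance's pitch and energy contours as a fixed-length
    vector of global statistics (pitch level, variability, range; loudness)."""
    return np.array([
        f0.mean(), f0.std(), f0.max() - f0.min(),
        energy.mean(), energy.std(),
    ])

def train(data):
    """Fit one small Gaussian mixture model per affective class."""
    models = {}
    for label, X in data.items():
        gmm = GaussianMixture(n_components=2, covariance_type="diag",
                              random_state=0)
        models[label] = gmm.fit(X)
    return models

def classify(models, x):
    """Label an utterance with the class whose mixture assigns the feature
    vector the highest log-likelihood."""
    return max(models, key=lambda label: models[label].score(x.reshape(1, -1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 50 fabricated utterances per class for training.
    data = {label: np.stack([prosodic_features(*fake_utterance(rng, *p))
                             for _ in range(50)])
            for label, p in PROFILES.items()}
    models = train(data)
    # Classify a fresh fabricated "prohibition"-style utterance.
    f0, energy = fake_utterance(rng, *PROFILES["prohibition"])
    print(classify(models, prosodic_features(f0, energy)))  # -> prohibition
```

In a real system the contours would come from a pitch tracker and an energy estimator running on microphone input rather than from `fake_utterance`; the per-class mixture-likelihood decision rule is a common pattern for this kind of low-dimensional acoustic classification.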




Cite this article

Breazeal, C., Aryananda, L. Recognition of Affective Communicative Intent in Robot-Directed Speech. Autonomous Robots 12, 83–104 (2002). https://doi.org/10.1023/A:1013215010749
