First steps toward natural human-like HRI

Abstract

Natural human-like human-robot interaction (NHL-HRI) requires the robot to be skilled both at recognizing and at producing many subtle human behaviors that humans often take for granted. We suggest a rough division of these requirements into three classes of properties: (1) social behaviors, (2) goal-oriented cognition, and (3) robust intelligence, and present DIARC, a novel architecture for complex affective robots intended for human-robot interaction, which aims to meet some of these requirements. We briefly describe the functional properties of DIARC and its implementation in our ADE system. We then report results from human subject evaluations in the laboratory as well as our experiences with the robot running ADE at the 2005 AAAI Robot Competition in the Open Interaction Event and Robot Exhibition.

Notes

  1. Note that the emphasis here is on both “consistency” and “extended time period”, as humans can sometimes be tricked into believing that something has a purpose because it seems to exhibit purposeful behavior for short periods of time. However, the deception will typically not last long (e.g., see the repeatedly failed attempts to convince humans that a computer is a human in the Loebner Prize competition).

  2. The ASCII representation is generated with the binary program distributed on Lowe's website http://www.cs.ubc.ca/~lowe/keypoints/.

  3. The current implementation of the action interpreter is still somewhat impoverished, as variables ranging over other scripts have not been implemented yet. For example, it is not possible to add “variable actions” such as “pick any script that satisfies preconditions \(X_i\) and execute it”, which would cause the action interpreter to search through its scripts and match them against the preconditions \(X_i\) (the first sketch following these notes illustrates the idea). Also, the current implementation only supports the detection of failures, but not “recursive” attempts to recover from them: “recursive” because recovery actions might themselves fail and might thus lead to recovery from recovery, and so on.

  4. Automatic failure recovery procedures have since been added to the infrastructure to address this problem; see also Section 4.

  5. This problem has since been addressed by applying a stereo vision algorithm to isolate an object in the foreground. Preprocessing the stereo image increases the likelihood that the computed keypoints actually belong to the object held up to the camera (the second sketch following these notes shows this kind of preprocessing).

  6. The details for reprioritization of goals were not provided in Breazeal et al. (2004).
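To make the missing “variable action” capability from Note 3 concrete, here is a minimal sketch in Python of what such a mechanism could look like: the interpreter searches a script library for any script whose preconditions \(X_i\) hold and executes it, and failure recovery is attempted recursively so that failed recovery actions can themselves trigger recovery. All names and types below are hypothetical illustrations, not the DIARC/ADE implementation.

```python
# Illustrative sketch only; not the DIARC/ADE action interpreter.
# Script, satisfied, pick_any, and execute are hypothetical names.

class Script:
    def __init__(self, name, preconditions, body, recovery=None):
        self.name = name
        self.preconditions = preconditions  # list of predicates over the state
        self.body = body                    # callable: state -> bool (success?)
        self.recovery = recovery            # optional recovery Script

def satisfied(script, state):
    """Check that every precondition X_i of the script holds in the state."""
    return all(p(state) for p in script.preconditions)

def pick_any(library, state):
    """'Variable action': pick any script whose preconditions are satisfied."""
    for script in library:
        if satisfied(script, state):
            return script
    return None

def execute(script, state, depth=0, max_depth=3):
    """Run a script body; on failure, recursively try its recovery script,
    so recovery from recovery is possible (bounded to guarantee termination)."""
    if script.body(state):
        return True
    if script.recovery is not None and depth < max_depth:
        return execute(script.recovery, state, depth + 1, max_depth)
    return False
```

A “variable action” inside a script would then simply call pick_any and execute the result, rather than naming a fixed sub-script.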
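Similarly, the stereo preprocessing mentioned in Note 5 can be illustrated with a generic OpenCV sketch. This is again an assumption-laden illustration, not the authors' pipeline: the paper used Lowe's keypoint binary rather than OpenCV's SIFT, and the file names, disparity threshold, and block-matching parameters below are placeholders.

```python
# Generic foreground-masking sketch with OpenCV; not the authors' pipeline.
# File names, disparity threshold, and matcher parameters are placeholders.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block-matching stereo: larger disparity means closer to the camera.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Keep only near pixels, i.e., the object held up to the camera.
mask = np.uint8(disparity > 20.0) * 255

# Restrict keypoint detection to the foreground mask, so descriptors are
# less likely to come from background clutter.
sift = cv2.SIFT_create()
keypoints = sift.detect(left, mask)
```

Restricting detection to a near-disparity mask is one straightforward way to raise the fraction of keypoints that belong to the foreground object rather than the background.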

References

  • Andronache, V. and Scheutz, M. 2004. Integrating theory and practice: The agent architecture framework APOC and its development environment ADE. In Proceedings of AAMAS 2004.

  • Andronache, V. and Scheutz, M. 2006. ADE—a tool for the development of distributed architectures for virtual and robotic agents. In Petta, P. and Müller, J. (Eds.), Best of AT2AI-4, vol. 20.

  • Bless, H., Schwarz, N., and Wieland, R. 1996. Mood and the impact of category membership and individuating information. European Journal of Social Psychology, 26:935–959.

  • Breazeal, C., Hoffman, G., and Lockerd, A. 2004. Teaching and working with robots as a collaboration. In Proceedings of AAMAS 2004.

  • Breazeal, C. and Scassellati, B. 1999. How to build robots that make friends and influence people. In IROS, pp. 858–863.

  • Breazeal, C.L. 2002. Designing Sociable Robots. MIT Press.

  • Burkhardt, F. 2005. Emofilt: the simulation of emotional speech by prosody-transformation. In Proceedings of Interspeech 2005.

  • Byers, Z., Dixon, M., Goodier, K., Grimm, C.M., and Smart, W.D. 2003. An autonomous robot photographer. In Proceedings of IROS 2003, Las Vegas, NV.

  • Clore, G., Gasper, K., and Conway, H. 2001. Affect as information. In Forgas, J. (Ed.), Handbook of Affect and Social Cognition, Erlbaum, Mahwah, NJ, pp. 121–144.

  • The CMU Sphinx group open source speech recognition engines. 2004. http://cmusphinx.sourceforge.net/html/cmusphinx.php.

  • Desai, M. and Yanco, H.A. 2005. Blending human and robot inputs for sliding scale autonomy. In Proceedings of the 14th IEEE International Workshop on Robot and Human Interactive Communication, Nashville, TN, August 2005.

  • Ekman, P. 1993. Facial expression and emotion. American Psychologist, 48(4):384–392.

  • Ekman, P. and Friesen, W.V. 1977. Manual for the Facial Action Coding System (FACS). Consulting Psychologists Press, Palo Alto.

  • The Festival speech synthesis system. 2004. http://www.cstr.ed.ac.uk/projects/festival/. Centre for Speech Technology Research.

  • Fillmore, C.J., Baker, C.F., and Sato, H. 2002. The FrameNet database and software tools. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC), Las Palmas, Spain, pp. 1157–1160.

  • Grosz, B.J. and Sidner, C.L. 1990. Plans for discourse. In Cohen, P.R., Morgan, J., and Pollack, M.E. (Eds.), Intentions in Communication. MIT Press, Cambridge, MA, pp. 417–444.

  • Hanson, D., Olney, A., Prilliman, S., Mathews, E., Zielke, M., Hammons, D., Fernandez, R., and Stephanou, H. 2005. Upending the uncanny valley. In Proceedings of the AAAI-05 Robot Workshop, Pittsburgh, PA, July 2005.

  • Ichise, R., Shapiro, D., and Langley, P. 2002. Learning hierarchical skills from observation. In Proceedings of the Fifth International Conference on Discovery Science, pp. 247–258.

  • Kanda, T., Iwase, K., Shiomi, M., and Ishiguro, H. 2005. A tension-moderating mechanism for promoting speech-based human-robot interaction. In IROS, pp. 527–532.

  • Kipper, K., Dang, H., and Palmer, M. 2000. Class-based construction of a verb lexicon. In Proceedings of AAAI 2000.

  • Kramer, J. and Scheutz, M. 2006a. Reflection and reasoning for system integrity in an agent architecture infrastructure. (Under review)

  • Kramer, J. and Scheutz, M. 2006b. ADE: A framework for robust complex robotic architectures. (Under review)

  • Lisetti, C.L., Brown, S., Alvarez, K., and Marpaung, A. 2004. A social informatics approach to human-robot interaction with an office service robot. IEEE Transactions on Systems, Man, and Cybernetics–Special Issue on Human Robot Interaction, 34(2):195–209.

  • Lowe, D.G. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110.

  • Michaud, F., Brosseau, Y., Côté, C., Létourneau, D., Moisan, P., Ponchon, A., Raievsky, C., Valin, J., Beaudry, É., and Kabanza, F. 2005. Modularity and integration in the design of a socially interactive robot. In Proceedings IEEE International Workshop on Robot and Human Interactive Communication, pp. 172–177.

  • Michaud, F., Duquette, A., and Nadeau, I. 2003. Characteristics of mobile robotic toys for children with pervasive developmental disorders. In Proceedings IEEE Conference on Systems, Man, and Cybernetics.

  • Middendorff, C. and Scheutz, M. 2006. Real-time evolving swarms for rapid pattern detection and tracking. In Proceedings of Artificial Life X.

  • Mueller, E.T. 1998. Natural Language Processing with ThoughtTreasure. Signiform, New York.

  • Murphy, R.R., Lisetti, C., Tardif, R., Irish, L., and Gage, A. 2002. Emotion-based control of cooperating heterogeneous mobile robots. IEEE Transactions on Robotics and Automation, 18(5):744–757.

  • Ortony, A., Norman, D., and Revelle, W. 2005. Effective functioning: A three level model of affect, motivation, cognition, and behavior. In Fellous, J. and Arbib, M. (Eds.), Who Needs Emotions? The Brain Meets the Machine. Oxford University Press, New York.

  • Pellom, B. and Hacioglu, K. 2003. Recent improvements in the CU SONIC ASR system for noisy speech: The SPINE task. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).

  • Picard, R. 1997. Affective Computing. MIT Press, Cambridge, MA.

  • Rani, P., Sarkar, N., and Smith, C.A. 2003. Affect-sensitive human-robot cooperation: Theory and experiments. In Proceedings of IEEE International Conference on Robotics and Automation (ICRA), pp. 2382–2387.

  • Schank, R.C. and Abelson, R.P. 1977. Scripts, Plans, Goals, and Understanding. Lawrence Erlbaum Associates, Hillsdale, NJ.

  • Scheutz, M. 2000. Surviving in a hostile multiagent environment: How simple affective states can aid in the competition for resources. In Hamilton, H.J. (Ed.), Advances in Artificial Intelligence, 13th Biennial Conference of the Canadian Society for Computational Studies of Intelligence, AI 2000, Montréal, Quebec, Canada, May 14–17, 2000, Proceedings, Springer, vol. 1822, pp. 389–399.

  • Scheutz, M. 2004. Useful roles of emotions in artificial agents: A case study from artificial life. In Proceedings of AAAI 2004.

  • Scheutz, M. 2006. ADE–steps towards a distributed development and runtime environment for complex robotic agent architectures. Applied Artificial Intelligence, 20(4–5).

  • Scheutz, M. and Andronache, V. 2003. APOC—a framework for complex agents. In Proceedings of the AAAI Spring Symposium. AAAI Press.

  • Scheutz, M. and Andronache, V. 2004. Architectural mechanisms for dynamic changes of behavior selection strategies in behavior-based systems. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(6):2377–2395.

  • Scheutz, M. and Logan, B. 2001. Affective versus deliberative agent control. In Colton, S. (Ed.), Proceedings of the AISB’01 Symposium on Emotion, Cognition and Affective Computing. York: Society for the Study of Artificial Intelligence and the Simulation of Behaviour, pp. 1–10.

  • Scheutz, M., McRaven, J., and Cserey, G. 2004. Fast, reliable, adaptive, bimodal people tracking for indoor environments. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

  • Scheutz, M., Schermerhorn, P., Kramer, J., and Middendorff, C. 2006. The utility of affect expression in natural language interactions in joint human-robot tasks. In Proceedings of the ACM Conference on Human-Robot Interaction (HRI2006).

  • Sleator, D. and Temperley, D. 1993. Parsing English with a link grammar. In Proceedings of the Third International Workshop on Parsing Technologies.

  • Yamamoto, S., Nakadai, K., Valin, J., Rouat, J., Michaud, F., Komatani, K., Ogata, T., and Okuno, H. 2005. Making a robot recognize three simultaneous sentences in real-time. In Proceedings IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 897–902.

  • Yang, M.-H., Kriegman, D.J. and Ahuja, N. 2002. Detecting faces in images: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(1):34–58.

Acknowledgment

The authors would like to thank Virgil Andronache, Chris Middendorff, Aaron Dingler, Peter Bui and Patrick Davis for their help with the implementation.

Author information

Corresponding author

Correspondence to Paul Schermerhorn.

About this article

Cite this article

Scheutz, M., Schermerhorn, P., Kramer, J. et al. First steps toward natural human-like HRI. Auton Robot 22, 411–423 (2007). https://doi.org/10.1007/s10514-006-9018-3
