
A Gesture Based Interface for Human-Robot Interaction

Published in: Autonomous Robots 9, 151–173 (2000)

Abstract

Service robotics is currently a highly active area of robotics research, with enormous societal potential. Since service robots directly interact with people, finding "natural" and easy-to-use user interfaces is of fundamental importance. While past work has predominantly focused on issues such as navigation and manipulation, relatively few robotic systems are equipped with flexible user interfaces that permit controlling the robot by "natural" means. This paper describes a gesture interface for the control of a mobile robot equipped with a manipulator. The interface uses a camera to track a person and recognize gestures involving arm motion. A fast, adaptive tracking algorithm enables the robot to track and follow a person reliably through office environments with changing lighting conditions. Two alternative methods for gesture recognition are compared: a template-based approach and a neural network approach. Both are combined with the Viterbi algorithm for the recognition of gestures defined through arm motion (in addition to static arm poses). Results are reported in the context of an interactive clean-up task, where a person guides the robot to specific locations that need to be cleaned and instructs the robot to pick up trash.
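
The paper's own implementation is not reproduced here, but the role the Viterbi algorithm plays in decoding motion gestures can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the two-state gesture model, its probabilities, and the observation encoding are hypothetical placeholders, not the models used in the paper. Each gesture is treated as a small hidden Markov model over discretized arm-pose observations, and Viterbi decoding recovers the most likely pose sequence.

```python
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path for an observation sequence.

    obs     : sequence of observation indices
    start_p : (S,) initial state probabilities
    trans_p : (S, S) state transition probabilities
    emit_p  : (S, O) observation probabilities per state
    Returns (log-probability of the best path, list of state indices).
    """
    S, T = len(start_p), len(obs)
    # Work in log space to avoid numerical underflow on long sequences.
    log_delta = np.log(start_p) + np.log(emit_p[:, obs[0]])
    backptr = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = log_delta[:, None] + np.log(trans_p)  # (from, to)
        backptr[t] = scores.argmax(axis=0)
        log_delta = scores.max(axis=0) + np.log(emit_p[:, obs[t]])
    # Trace the best path backwards from the most likely final state.
    path = [int(log_delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(backptr[t, path[-1]]))
    return float(log_delta.max()), path[::-1]

# Hypothetical two-state "arm lowered / arm raised" gesture model.
start = np.array([0.8, 0.2])
trans = np.array([[0.7, 0.3],
                  [0.4, 0.6]])
emit = np.array([[0.9, 0.1],   # state 0 mostly emits observation 0
                 [0.2, 0.8]])  # state 1 mostly emits observation 1
logp, states = viterbi([0, 0, 1, 1, 1], start, trans, emit)
print(states)  # -> [0, 0, 1, 1, 1]
```

In a recognizer of this kind, one such model per gesture would be scored against the incoming stream of tracked arm poses, and the gesture whose model yields the highest path probability is reported.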




Cite this article

Waldherr, S., Romero, R. & Thrun, S. A Gesture Based Interface for Human-Robot Interaction. Autonomous Robots 9, 151–173 (2000). https://doi.org/10.1023/A:1008918401478
