Skip to main content

Learning Actions through Imitation and Exploration: Towards Humanoid Robots That Learn from Humans

  • Chapter
Creating Brain-Like Intelligence

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 5436))

Abstract

A prerequisite for achieving brain-like intelligence is the ability to rapidly learn new behaviors and actions. A fundamental mechanism for rapid learning in humans is imitation: children routinely learn new skills (e.g., opening a door or tying a shoe lace) by imitating their parents; adults continue to learn by imitating skilled instructors (e.g., in tennis). In this chapter, we propose a probabilistic framework for imitation learning in robots that is inspired by how humans learn from imitation and exploration. Rather than relying on complex (and often brittle) physics-based models, the robot learns a dynamic Bayesian network that captures its dynamics directly in terms of sensor measurements and actions during an imitation-guided exploration phase. After learning, actions are selected based on probabilistic inference in the learned Bayesian network. We present results demonstrating that a 25-degree-of-freedom humanoid robot can learn dynamically stable, full-body imitative motions simply by observing a human demonstrator.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Turing, A.: Computing machinery and intelligence. Mind 59, 433–460 (1950)

    Article  Google Scholar 

  2. McCarthy, J., Minsky, M., Rochester, N., Shannon, C.: A proposal for the dartmouth summer research project on artificial intelligence (1955)

    Google Scholar 

  3. Meltzoff, A.N.: Elements of a developmental theory of imitation. In: The imitative mind: Development, evolution, and brain bases, pp. 19–41. Cambridge University Press, Cambridge (2002)

    Chapter  Google Scholar 

  4. Doya, K., Ishii, S., Pouget, A., Rao, R.P.N. (eds.): Bayesian Brain: Probabilistic Approaches to Neural Coding. MIT Press, Cambridge (2007)

    Google Scholar 

  5. Rao, R.P.N., Olshausen, B.A., Lewicki, M.S. (eds.): Probabilistic Models of the Brain: Perception and Neural Function, Perception and Neural Function. MIT Press, Cambridge (2002)

    Google Scholar 

  6. Rao, R.P.N., Shon, A.P., Meltzoff, A.N.: A Bayesian model of imitation in infants and robots. In: Imitation and Social Learning in Robots, Humans, and Animals. Cambridge University Press, Cambridge (2005)

    Google Scholar 

  7. Kuniyoshi, Y., Inaba, M., Inoue, H.: Learning by watching: Extracting reusable task knowledge from visual observation of human performance. Transaction on Robotics and Automation 10(6), 799–822 (1994)

    Article  Google Scholar 

  8. Takahashi, Y., Hikita, K., Asada, M.: Incremental purposive behavior acquisition based on self-interpretation of instructions by coach. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003), pp. 686–693. IEEE Computer Society Press, Los Alamitos (2003)

    Google Scholar 

  9. Schaal, S., Ijspeert, A., Billard, A.: Computational approaches to motor learning by imitation. The Neuroscience of Social Interaction 1(1431), 199–218 (2004)

    Google Scholar 

  10. Inamura, T., Toshima, I., Nakamura, Y.: Acquiring motion elements for bi-directional computation of motion recognition and generation. In: Experimental Robotics VIII, pp. 372–381. Springer, Heidelberg (2003)

    Chapter  Google Scholar 

  11. Ijspeert, A.J., Nakanishi, J., Schaal, S.: Trajectory formation for imitation with nonlinear dynamical systems. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2001), pp. 752–757. IEEE Press, Los Alamitos (2001)

    Google Scholar 

  12. Billard, A., Mataric, M.: Learning human arm movements by imitation: Evaluation of a biologically-inspired connectionist architecture. Robotics and Autonomous Systems 37(941), 145–160 (2001)

    Article  Google Scholar 

  13. Calinon, S., Guenter, F., Billard, A.: On learning, representing and generalizing a task in a humanoid robot. IEEE Transactions on Systems, Man and Cybernetics, Part B. Special issue on robot learning by observation, demonstration and imitation 37(2), 286–298 (2007)

    Article  Google Scholar 

  14. Demiris, J., Hayes, G.: A robot controller using learning by imitation. In: Proceedings of the 2nd International Symposium on Intelligent Robotic Systems (IROS 1994). IEEE Press, Los Alamitos (1994)

    Google Scholar 

  15. Schaal, S.: Learning from demonstration. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) Advances in Neural Information Processing Systems 9 (NIPS 1996), vol. 9, p. 1040. MIT Press, Cambridge (1997)

    Google Scholar 

  16. Atkeson, C.G., Schaal, S.: Robot learning from demonstration. In: Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), pp. 12–20 (1997)

    Google Scholar 

  17. Watkins, C.: Learning from Delayed Rewards. PhD thesis, Cambridge University (1989)

    Google Scholar 

  18. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (1998)

    Google Scholar 

  19. Price, B.: Accelerating Reinforcement Learning with Imitation. PhD thesis, University of British Columbia (2003)

    Google Scholar 

  20. Ng, A.Y., Russell, S.: Algorithms for inverse reinforcement learning. In: Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), pp. 663–670 (2000)

    Google Scholar 

  21. Abbeel, P., Ng, A.Y.: Exploration and apprenticeship learning in reinforcement learning. In: Proceedings of the Twenty-first International Conference on Machine Learning (ICML 2005) (2005)

    Google Scholar 

  22. Schaal, S.: Is imitation learning the route to humanoid robots? Trends Cognitive Science 3(6), 233–242 (1999)

    Article  CAS  Google Scholar 

  23. Calinon, S., Guenter, F., Billard, A.: Goal-directed imitation in a humanoid robot. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2005). IEEE Press, Los Alamitos (2005)

    Google Scholar 

  24. Webots: Commercial Mobile Robot Simulation Software, http://www.cyberbotics.com

  25. Featherstone, R.: Robot Dynamics Algorithms. Springer, Heidelberg (1987)

    Google Scholar 

  26. Luh, J.Y.S., Walker, M.W., Paul, R.P.C.: On-line computational scheme for mechanical manipulators. Dynamic Systems Measurement and Control 102 (1980)

    Google Scholar 

  27. Chang, K.S., Khatib, O.: Efficient algorithm for extended operational space inertia matrix. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 1999). IEEE Press, Los Alamitos (1999)

    Google Scholar 

  28. Marhefka, D., Orin, D.: Simulation of contact using a nonlinear damping model. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 1996). IEEE Press, Los Alamitos (1996)

    Google Scholar 

  29. Lotstedt, P.: Numerical simulation of time-dependent contact friction problems in rigid body mechanics. SIAM Journal on Scientific Statistical Computing 5(2), 370–393 (1984)

    Article  Google Scholar 

  30. Stewart, D., Trinkle, J.: An implicit time-stepping scheme for rigid body dynamics with coulomb friction. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2000). IEEE Press, Los Alamitos (2000)

    Google Scholar 

  31. Kuffner, J.J., Nishiwaki, K., Kagami, S., Inaba, M., Inoue, H.: Motion planning for humanoid robots under obstacle and dynamic balance constraints. In: Proceedings of the IEEE International Conf. Robotics and Automation (ICRA 2001), pp. 692–698. IEEE Press, Los Alamitos (2001)

    Google Scholar 

  32. Frank, A.A., McGhee, R.B.: Some considerations realation to the design of autopilots for legged vehicles. Terramechanics 6, 23–25 (1969)

    Article  Google Scholar 

  33. Vukobratovic, M., Borovac, B.: Zero-moment point - thirty five years of its life. International Journal of Humanoid Robotics 1(1), 157–173 (2004)

    Article  Google Scholar 

  34. Park, J., Rhee, Y.: ZMP trajectory generation for reduced trunk motions of biped robots. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 1998). IEEE Press, Los Alamitos (1998)

    Google Scholar 

  35. Huang, Q., Kajita, S., Koyachi, N., Kaneko, K., Yokoi, K., Arai, H., Komoriya, K., Tanie, K.: A high stability, smooth walking pattern for a biped robot. In: Proceedings of the IEEE International Conf. Robotics and Automation (ICRA 1999). IEEE Press, Los Alamitos (1999)

    Google Scholar 

  36. Kagami, S., Kanehiro, F., Tamiya, Y., Inaba, M., Inoue, H.: Autobalancer: an online dynamic balance compensation scheme for humanoid robots. In: Proceedings of the International Workshop on Algorithmic Foundation of Robotics, pp. 329–340 (2000)

    Google Scholar 

  37. Park, J., Kim, K.: Biped robot walking using gravity-compensated inverted pendulum mode and computed torque control. In: Proceedings of the IEEE International Conf. Robotics and Automation (ICRA 1998). IEEE Press, Los Alamitos (1998)

    Google Scholar 

  38. Yamaguchi, Takanishi, A., Kato, I.: Development of a biped walking robot compensating for three-axis moment by trunk motion. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 1993), pp. 561–566. IEEE Press, Los Alamitos (1993)

    Chapter  Google Scholar 

  39. Yamane, K., Nakamura, Y.: Dynamics filter - concept and implementation of on-line motion generator for human figures. IEEE Transactions on Robotics and Automation 19(3), 421–432 (2003)

    Article  Google Scholar 

  40. Ko, J., Klein, D., Fox, D., Hahnel, D.: GP-UKF: Unscented Kalman filters with gaussian process prediction and observation models. In: Proceedings of IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS 2007). IEEE Press, Los Alamitos (2007)

    Google Scholar 

  41. Shon, A.P., Verma, D., Rao, R.P.N.: Active imitation learning. In: Proceedings of the American Association for Artificial Intelligence (AAAI 2007) (2007)

    Google Scholar 

  42. Barbic, J., Safonova, A., Pan, J.Y., Faloutsos, C., Hodgins, J.K., Pollard, N.S.: Segmenting motion capture data into distinct behaviors. In: Proceedings of Graphics Interface (GI 2004), University of Waterloo, Waterloo, Ontario, Canada, Canadian Human-Computer Communications Society, pp. 185–194 (2004)

    Google Scholar 

  43. Muller, M., Roder, T.: Motion templates for automatic classification and retrieval of motion capture data. In: Proceedings of the 2006 ACM SIGGRAPH/Eurographics symposium on Computer animation (SCA 2006), Aire-la-Ville, Switzerland, Eurographics Association, pp. 137–146 (2006)

    Google Scholar 

  44. Seth, A., Pandy, M.G.: A nonlinear tracking method of computing net joint torques for human movement. In: Proceedings of the 26th Annual International Conference of the Engineering in Medicine and Biology Society (2004)

    Google Scholar 

  45. Sung, H.G.: Gaussian Mixture Regression and Classification. PhD thesis, Rice University (2004)

    Google Scholar 

  46. Welling, M., Kurihara, K.: Bayesian K-means as a Maximization-Expectation algorithm. In: Proceedings of the SIAM conference on Data Mining (2005)

    Google Scholar 

  47. Scott, D., Szewczyk, W.: From kernels to mixtures. Technometrics 43(3), 323–335 (2001)

    Article  Google Scholar 

  48. Kreutz, M., Reimetz, A.M., Sendhoff, B., Weihs, C., von Seelen, W.: Structure optimization of density estimation models applied to regression problems with dynamic noise. In: Proceedings of the Seventh International Workshop on Artificial Intelligence and Statistics, pp. 237–242. Morgan Kaufmann, San Francisco (1999)

    Google Scholar 

  49. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to algorithms. MIT Press, Cambridge (2001)

    Google Scholar 

  50. Park, J.D., Darwiche, A.: Complexity results and approximation strategies for map explanations. Journal of Artififical Intelligence Research (JAIR) 21, 101–133 (2004)

    Google Scholar 

  51. Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, San Francisco (1988)

    Google Scholar 

  52. Weiss, Y.: Correctness of local probability propagation in graphical models with loops. Neural Computation 12(1), 1–41 (2000)

    Article  CAS  PubMed  Google Scholar 

  53. Sudderth, E.B., Ihler, A.T., Freeman, W.T., Willsky, A.S.: Nonparametric belief propagation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2003), pp. 605–612 (2003)

    Google Scholar 

  54. Kschischang, F.R., Frey, B.J., Loeliger, H.A.: Factor graphs and the sum-product algorithm. IEEE Transactions on Information Theory 47(2), 498–519 (2001)

    Article  Google Scholar 

  55. Carreira-Perpinan, M.A.: Mode-finding for mixtures of gaussian distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 22(11), 1318–1323 (2000)

    Article  Google Scholar 

  56. Hwang, J., Lay, S., Lippman, A.: Nonparametric multivariate density estimation: a comparative study. IEEE Transactions on Signal Processing 42(10), 2795–2810 (1994)

    Article  Google Scholar 

  57. Silverman, B.W.: Density Estimation for Statistics and Data Analysis. Chapman and Hall, Boca Raton (1986)

    Book  Google Scholar 

  58. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological) 39(1), 1–38 (1977)

    Google Scholar 

  59. Vicon: Vicon MX Motion Capture System, http://www.vicon.com

  60. Lawrence, N.D.: Gaussian process latent variable models for visualization of high dimensional data. In: Advances in Neural Information Processing Systems 15 (NIPS 2002). MIT Press, Cambridge (2003)

    Google Scholar 

  61. Grochow, K., Martin, S.L., Hertzmann, A., Popovic, Z.: Style-based inverse kinematics. In: Proceedings of the ACM Transactions on Graphics, SIGGRAPH 2004 (2004)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Grimes, D.B., Rao, R.P.N. (2009). Learning Actions through Imitation and Exploration: Towards Humanoid Robots That Learn from Humans. In: Sendhoff, B., Körner, E., Sporns, O., Ritter, H., Doya, K. (eds) Creating Brain-Like Intelligence. Lecture Notes in Computer Science(), vol 5436. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00616-6_7

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-00616-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00615-9

  • Online ISBN: 978-3-642-00616-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics