Imitation and Reinforcement Learning for Motor Primitives with Perceptual Coupling

Kober, Jens; Mohler, Betty; Peters, Jan

doi:10.1007/978-3-642-05181-4_10

Jens Kober⁴,
Betty Mohler⁴ &
Jan Peters⁴

Part of the book series: Studies in Computational Intelligence ((SCI,volume 264))

1516 Accesses
6 Citations

Abstract

Traditional motor primitive approaches deal largely with open-loop policies which can only deal with small perturbations. In this paper, we present a new type of motor primitive policies which serve as closed-loop policies together with an appropriate learning algorithm. Our new motor primitives are an augmented version version of the dynamical system-based motor primitives [Ijspeert et al(2002)Ijspeert, Nakanishi, and Schaal] that incorporates perceptual coupling to external variables. We show that these motor primitives can perform complex tasks such as Ball-in-a-Cup or Kendama task even with large variances in the initial conditions where a skilled human player would be challenged. We initialize the open-loop policies by imitation learning and the perceptual coupling with a handcrafted solution. We first improve the open-loop policies and subsequently the perceptual coupling using a novel reinforcement learning method which is particularly well-suited for dynamical system-based motor primitives.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 169.00; Price excludes VAT (USA)

Softcover Book: USD 219.99; Price excludes VAT (USA)

Hardcover Book: USD 219.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

References

Andrieu, C., de Freitas, N., Doucet, A., Jordan, M.I.: An introduction to MCMC for machine learning. Machine Learning 50(1), 5–43 (2003)
Article MATH Google Scholar
Atkeson, C.G.: Using local trajectory optimizers to speed up global optimization in dynamic programming. In: Hanson, J.E., Moody, S.J., Lippmann, R.P. (eds.) Advances in Neural Information Processing Systems 6 (NIPS), pp. 503–521. Morgan Kaufmann, Denver (1994)
Google Scholar
Guenter, F., Hersch, M., Calinon, S., Billard, A.: Reinforcement learning for imitating constrained reaching movements. Advanced Robotics, Special Issue on Imitative Robots 21(13), 1521–1544 (2007)
Google Scholar
Howard, M., Klanke, S., Gienger, M., Goerick, C., Vijayakumar, S.: Methods for learning control policies from variable-constraint demonstrations. In: Sigaud, O., Peters, J. (eds.) From Motor Learning to Interaction Learning in Robots. SCI, vol. 264, pp. 253–291. Springer, Heidelberg (2010)
Google Scholar
Howard, M., Klanke, S., Gienger, M., Goerick, C., Vijayakumar, S.: A novel method for learning policies from variable constraint data. Autonomous Robots (2009b)
Google Scholar
Ijspeert, A.J., Nakanishi, J., Schaal, S.: Movement imitation with nonlinear dynamical systems in humanoid robots. In: Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), Washington, DC, pp. 1398–1403 (2002)
Google Scholar
Ijspeert, A.J., Nakanishi, J., Schaal, S.: Learning attractor landscapes for learning motor primitives. In: Becker, S., Thrun, S., Obermayer, K. (eds.) Advances in Neural Information Processing Systems 16 (NIPS), vol. 15, pp. 1547–1554. MIT Press, Cambridge (2003)
Google Scholar
Kober, J., Peters, J.: Policy search for motor primitives in robotics. In: Advances in Neural Information Processing Systems, NIPS (2008)
Google Scholar
Kulic, D., Nakamura, Y.: Incremental learning of full body motion primitives. In: Sigaud, O., Peters, J. (eds.) From Motor Learning to Interaction Learning in Robots. SCI, vol. 264, pp. 383–406. Springer, Heidelberg (2010)
Google Scholar
Miyamoto, H., Schaal, S., Gandolfo, F., Gomi, H., Koike, Y., Osu, R., Nakano, E., Wada, Y., Kawato, M.: A kendama learning robot based on bi-directional theory. Neural Networks 9(8), 1281–1302 (1996)
Article Google Scholar
Nakanishi, J., Morimoto, J., Endo, G., Cheng, G., Schaal, S., Kawato, M.: A framework for learning biped locomotion with dynamic movement primitives. In: Proc. IEEE-RAS Int. Conf. on Humanoid Robots (HUMANOIDS), Santa Monica, CA, November 10-12. IEEE, Los Angeles (2004)
Google Scholar
Nakanishi, J., Morimoto, J., Endo, G., Cheng, G., Schaal, S., Kawato, M.: Learning from demonstration and adaptation of biped locomotion. Robotics and Autonomous Systems (RAS) 47(2-3), 79–91 (2004)
Article Google Scholar
Nakanishi, J., Mistry, M., Peters, J., Schaal, S.: Experimental evaluation of task space position/orientation control towards compliant control for humanoid robots. In: Proc. IEEE/RSJ 2007 Int. Conf. on Intell. Robotics Systems, IROS (2007)
Google Scholar
Peters, J., Schaal, S.: Policy gradient methods for robotics. In: Proc. IEEE/RSJ 2006 Int. Conf. on Intell. Robots and Systems (IROS), Beijing, China, pp. 2219–2225 (2006)
Google Scholar
Peters, J., Schaal, S.: Reinforcement learning for operational space. In: Proc. Int. Conference on Robotics and Automation (ICRA), Rome, Italy (2007)
Google Scholar
Pongas, D., Billard, A., Schaal, S.: Rapid synchronization and accurate phase-locking of rhythmic motor primitives. In: Proc. IEEE 2005 Int. Conf. on Intell. Robots and Systems (IROS), vol. 2005, pp. 2911–2916 (2005)
Google Scholar
Ratliff, N., Silver, D., Bagnell, J.: Learning to search: Functional gradient techniques for imitation learning. Autonomous Robots 27(1), 25–53 (2009)
Article Google Scholar
Riedmiller, M., Gabel, T., Hafner, R., Lange, S.: Reinforcement learning for robot soccer. Autonomous Robots 27(1), 55–73 (2009)
Article Google Scholar
Rückstieß, T., Felder, M., Schmidhuber, J.: State-dependent exploration for policy gradient methods. In: Proceedings of the European Conference on Machine Learning (ECML), pp. 234–249 (2008)
Google Scholar
Sato, S., Sakaguchi, T., Masutani, Y., Miyazaki, F.: Mastering of a task with interaction between a robot and its environment: “kendama” task. Transactions of the Japan Society of Mechanical Engineers C 59(558), 487–493 (1993)
Google Scholar
Schaal, S., Peters, J., Nakanishi, J., Ijspeert, A.J.: Control, planning, learning, and imitation with dynamic movement primitives. In: Proc. Workshop on Bilateral Paradigms on Humans and Humanoids, IEEE 2003 Int. Conf. on Intell. Robots and Systems (IROS), Las Vegas, NV, October 27-31 (2003)
Google Scholar
Schaal, S., Mohajerian, P., Ijspeert, A.J.: Dynamics systems vs. optimal control — a unifying view. Progress in Brain Research 165(1), 425–445 (2007)
Article Google Scholar
Shone, T., Krudysz, G., Brown, K.: Dynamic manipulation of kendama. Tech. rep., Rensselaer Polytechnic Institute (2000)
Google Scholar
Sutton, R., Barto, A.: Reinforcement Learning. MIT Press, Cambridge (1998)
Google Scholar
Takenaka, K.: Dynamical control of manipulator with vision: “cup and ball” game demonstrated by robot. Transactions of the Japan Society of Mechanical Engineers C 50(458), 2046–2053 (1984)
Google Scholar
Urbanek, H., Albu-Schäffer, A., van der Smagt, P.: Learning from demonstration repetitive movements for autonomous service robotics. In: Proc. IEEE/RSL 2004 Int. Conf. on Intell. Robots and Systems (IROS), Sendai, Japan, pp. 3495–3500 (2004)
Google Scholar
Wikipedia (2008), http://en.wikipedia.org/wiki/Ball_in_a_cup
Williams, R.J.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning 8, 229–256 (1992)
MATH Google Scholar
Wulf, G.: Attention and motor skill learning. Human Kinetics, Urbana Champaign (2007)
Google Scholar

Download references

Author information

Authors and Affiliations

Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Jens Kober, Betty Mohler & Jan Peters

Authors

Jens Kober
View author publications
You can also search for this author in PubMed Google Scholar
Betty Mohler
View author publications
You can also search for this author in PubMed Google Scholar
Jan Peters
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Institut des Systèmes Intelligents et de Robotique (CNRS UMR 7222), Université Pierre et Marie Curie Pyramide, Tour 55 Boîte courrier 173, 4 Place Jussieu, 75252, PARIS cedex 05, France
Olivier Sigaud
Dept. Schölkopf, Max-Planck Institute for Biological Cybernetics, Spemannstraße 38,Rm 223, 72076, Tübingen, Germany
Jan Peters

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Kober, J., Mohler, B., Peters, J. (2010). Imitation and Reinforcement Learning for Motor Primitives with Perceptual Coupling. In: Sigaud, O., Peters, J. (eds) From Motor Learning to Interaction Learning in Robots. Studies in Computational Intelligence, vol 264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05181-4_10

Download citation

DOI: https://doi.org/10.1007/978-3-642-05181-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05180-7
Online ISBN: 978-3-642-05181-4
eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics