Contextual Policy Search for Linear and Nonlinear Generalization of a Humanoid Walking Controller

Abdolmaleki, Abbas; Lau, Nuno; Reis, Luis Paulo; Peters, Jan; Neumann, Gerhard

doi:10.1007/s10846-016-0347-y

Published: 15 February 2016

Journal of Intelligent & Robotic Systems, Volume 83, pages 393–408 (2016)

Abbas Abdolmaleki^1,2,3, Nuno Lau^1, Luis Paulo Reis^2,3, Jan Peters^4,5 & Gerhard Neumann^6

Abstract

We investigate learning of flexible robot locomotion controllers, i.e., controllers that are applicable to multiple contexts, for example different walking speeds, various terrain slopes, or other physical properties of the robot. In our experiments, the context is the desired linear walking speed of the gait. Current approaches for learning the control parameters of biped locomotion controllers are typically applicable to only a single context, for example learning a gait with the highest speed, the lowest energy consumption, or a combination of both. Our research question is: how can we obtain a flexible walking controller that controls the robot (near) optimally in many different contexts? We achieve the desired flexibility by applying the recently developed contextual relative entropy policy search (REPS) method, which generalizes the robot walking controller across contexts, where a context is described by a real-valued vector. In this paper we also extend the contextual REPS algorithm to learn a non-linear policy over the contexts instead of a linear one; we call this extension RBF-REPS, as it uses radial basis functions. To validate our method, we perform three simulation experiments, including a walking experiment using a simulated NAO humanoid robot. The robot learns a policy to choose the controller parameters for a continuous set of forward walking speeds.
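
To illustrate the difference between a linear and a non-linear policy over contexts, the following is a minimal sketch, not the authors' implementation. It assumes the mean of the controller parameters is a weighted sum of context features, with either linear features or Gaussian radial basis features. The function names, the RBF centers, the bandwidth, and the weight values are all hypothetical; the abstract does not specify the parameterization used in RBF-REPS, and the REPS weight update itself is not shown.

```python
import math

def linear_features(context):
    """Features [1, s_1, ..., s_d]: the policy mean is linear in the context."""
    return [1.0] + list(context)

def rbf_features(context, centers, bandwidth):
    """Features [1, k(s, c_1), ..., k(s, c_m)] using Gaussian kernels.

    The centers and bandwidth are illustrative assumptions.
    """
    feats = [1.0]
    for center in centers:
        sq_dist = sum((s - c) ** 2 for s, c in zip(context, center))
        feats.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    return feats

def policy_mean(weights, features):
    """Mean controller parameters; one weight row per controller parameter."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

# Hypothetical one-dimensional context: desired forward walking speed.
speed = [0.5]
centers = [[0.0], [0.5], [1.0]]

phi_lin = linear_features(speed)
phi_rbf = rbf_features(speed, centers, bandwidth=0.25)

# Made-up weights for a single controller parameter.
W_lin = [[0.1, 0.2]]
W_rbf = [[0.0, 1.0, 2.0, 3.0]]

mean_lin = policy_mean(W_lin, phi_lin)
mean_rbf = policy_mean(W_rbf, phi_rbf)
```

With linear features the chosen controller parameters can only vary linearly with the desired speed, whereas the RBF features let the same weighted-sum form represent a non-linear dependence on the context.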

Author information

Authors and Affiliations

DETI / IEETA, University of Aveiro, Aveiro, Portugal
Abbas Abdolmaleki & Nuno Lau
DSI, University of Minho, Guimarães, Portugal
Abbas Abdolmaleki & Luis Paulo Reis
LIACC, University of Porto, Porto, Portugal
Abbas Abdolmaleki & Luis Paulo Reis
IAS, TU Darmstadt, Darmstadt, Germany
Jan Peters
MPI for Intelligent Systems, Stuttgart, Germany
Jan Peters
CLAS, TU Darmstadt, Darmstadt, Germany
Gerhard Neumann

Corresponding author

Correspondence to Abbas Abdolmaleki.

About this article

Cite this article

Abdolmaleki, A., Lau, N., Reis, L.P. et al. Contextual Policy Search for Linear and Nonlinear Generalization of a Humanoid Walking Controller. J Intell Robot Syst 83, 393–408 (2016). https://doi.org/10.1007/s10846-016-0347-y

Received: 14 June 2015
Accepted: 27 January 2016
Published: 15 February 2016
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10846-016-0347-y
