
Hybrid Dynamic Control Algorithm for Humanoid Robots Based on Reinforcement Learning

Journal of Intelligent and Robotic Systems

Abstract

This paper presents a hybrid integrated dynamic control algorithm for a humanoid locomotion mechanism. The proposed controller structure involves two feedback loops: a model-based dynamic controller that includes an impact-force controller, and a reinforcement learning feedback controller around the zero-moment point. The proposed reinforcement learning algorithm is based on a modified version of the actor-critic architecture for dynamic reactive compensation. Simulation experiments were carried out to validate the proposed control approach. The obtained simulation results served as the basis for a critical evaluation of the controller performance.
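
As a rough illustration of the actor-critic idea named in the abstract, the sketch below shows one generic temporal-difference update for a scalar zero-moment-point (ZMP) tracking error. It is a textbook-style simplification, not the modified architecture proposed in the paper: the feature vector, the learning rates, the discount factor, and all function names are assumptions made for this example only.

```python
def features(zmp_error):
    """Hypothetical feature vector for a scalar ZMP tracking error."""
    return [1.0, zmp_error, zmp_error * zmp_error]


def dot(weights, x):
    """Inner product of two equal-length lists."""
    return sum(w * xi for w, xi in zip(weights, x))


def actor_critic_step(w_critic, w_actor, err, next_err, action, mean_action,
                      reward, gamma=0.9, alpha_c=0.1, alpha_a=0.01):
    """One TD(0) actor-critic update (illustrative, assumed hyperparameters).

    Returns (td_error, new_critic_weights, new_actor_weights).
    """
    x, x_next = features(err), features(next_err)
    # TD error: reward plus discounted next value minus the current value.
    delta = reward + gamma * dot(w_critic, x_next) - dot(w_critic, x)
    # Critic: move the linear value estimate toward the TD target.
    new_critic = [w + alpha_c * delta * xi for w, xi in zip(w_critic, x)]
    # Actor: shift the mean action toward the explored action when delta > 0,
    # and away from it when delta < 0 (Gaussian policy with fixed variance).
    new_actor = [w + alpha_a * delta * (action - mean_action) * xi
                 for w, xi in zip(w_actor, x)]
    return delta, new_critic, new_actor


# Example call: the ZMP error shrinks from 0.5 to 0.25, and the reward
# penalizes the squared error observed before the step.
delta, critic, actor = actor_critic_step(
    w_critic=[0.0, 0.0, 0.0], w_actor=[0.0, 0.0, 0.0],
    err=0.5, next_err=0.25, action=0.2, mean_action=0.0, reward=-0.25)
```

In a hybrid scheme of the kind the abstract describes, the actor output would act as a compensation signal added to the model-based control loop; how the two loops are actually combined is specified in the full article, not in this sketch.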



Author information


Correspondence to Duško M. Katić.


About this article

Cite this article

Katić, D.M., Rodić, A.D. & Vukobratović, M.K. Hybrid Dynamic Control Algorithm for Humanoid Robots Based on Reinforcement Learning. J Intell Robot Syst 51, 3–30 (2008). https://doi.org/10.1007/s10846-007-9174-5


  • DOI: https://doi.org/10.1007/s10846-007-9174-5
