Abstract
Reinforcement-learning-based methods can achieve excellent results in robot locomotion control. Their main drawbacks, however, are the long time needed to train the agent and the large number of parameters that define its behavior. In this paper, we propose a method that significantly reduces training time. It is based on the Policies Modulating Trajectory Generators (PMTG) architecture, with a Central Pattern Generator (CPG) serving as the gait generator. We tested this approach in the OpenAI Gym BipedalWalker-v3 environment. The paper presents the results of this algorithm, showing its effectiveness in solving a locomotion problem over challenging terrain.
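To make the architecture named in the abstract more concrete, the following is a minimal sketch of the general PMTG idea: a trajectory generator produces a periodic base action, and the learned policy both modulates the generator's parameters and adds a residual correction. It is not the authors' implementation. The class `PhaseOscillatorTG`, the functions `pmtg_action` and `tg_observation`, the joint ordering, the quarter-cycle knee offset, and the time step `dt` are all assumptions made for illustration; the only environment detail relied on is that BipedalWalker-v3 takes four joint commands in [-1, 1].

```python
import math
from dataclasses import dataclass


@dataclass
class PhaseOscillatorTG:
    """Hypothetical CPG-like trajectory generator with a single shared phase."""
    phase: float = 0.0

    def step(self, frequency_hz, amplitude, dt):
        # Advance the phase and wrap it to [0, 2*pi).
        self.phase = (self.phase + 2.0 * math.pi * frequency_hz * dt) % (2.0 * math.pi)
        # Assumed joint order: [hip_1, knee_1, hip_2, knee_2].
        # The two legs run in antiphase; each knee is offset from its hip
        # by a quarter cycle (an illustrative choice, not from the paper).
        return [
            amplitude * math.sin(self.phase),
            amplitude * math.sin(self.phase - math.pi / 2),
            amplitude * math.sin(self.phase + math.pi),
            amplitude * math.sin(self.phase + math.pi / 2),
        ]


def pmtg_action(tg, policy_output, dt=0.02):
    """Combine the generator output with the policy's residual, clipped to [-1, 1].

    policy_output is assumed to be [frequency_hz, amplitude, r1, r2, r3, r4].
    """
    frequency_hz, amplitude, *residual = policy_output
    base = tg.step(frequency_hz, amplitude, dt)
    return [max(-1.0, min(1.0, b + r)) for b, r in zip(base, residual)]


def tg_observation(tg):
    # The phase is fed back to the policy as (sin, cos) so that the
    # wrap-around at 2*pi does not appear as a discontinuity.
    return [math.sin(tg.phase), math.cos(tg.phase)]
```

In a training loop, the policy would receive the environment observation concatenated with `tg_observation(tg)` and emit the six-element vector consumed by `pmtg_action`. Because the generator already supplies a rhythmic gait, the policy only has to learn modulation and corrections, which is the intuition behind the reduced training time claimed in the abstract.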
References
McGhee, R.B.: Finite state control of quadruped locomotion. Simulation 9(3), 135–140 (1967). https://doi.org/10.1177/003754976700900308
Raibert, M.H.: Legged Robots that Balance. MIT Press, Cambridge (1986)
Villarreal, O., Barasuol, V., Wensing, P.M., Caldwell, D.G., Semini, C.: MPC-based controller with terrain insight for dynamic legged locomotion. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 2436–2442 (2020)
Sleiman, J.P., Farshidian, F., Minniti, M.V., Hutter, M.: A unified MPC framework for whole-body dynamic locomotion and manipulation. IEEE Robot. Autom. Lett. 6(3), 4688–4695 (2021)
Bjelonic, M., Grandia, R., Harley, O., Galliard, C., Zimmermann, S., Hutter, M.: Whole-body MPC and online gait sequence generation for wheeled-legged robots. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 8388–8395 (2021)
Di Carlo, J., Wensing, P.M., Katz, B., Bledt, G., Kim, S.: Dynamic locomotion in the MIT Cheetah 3 through convex model-predictive control. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1–9 (2018)
Iscen, A., Caluwaerts, K., Tan, J., Zhang, T., Coumans, E., Sindhwani, V., Vanhoucke, V.: Policies modulating trajectory generators. In: Conference on Robot Learning, pp. 916–926 (2018)
Tan, J., Zhang, T., Coumans, E., Iscen, A., Bai, Y., Hafner, D., Vanhoucke, V.: Sim-to-real: Learning agile locomotion for quadruped robots (2018). arXiv:1804.10332
Kumar, A., Fu, Z., Pathak, D., Malik, J.: RMA: rapid motor adaptation for legged robots (2021). arXiv:2107.04034
Hwangbo, J., Lee, J., Dosovitskiy, A., Bellicoso, D., Tsounis, V., Koltun, V., Hutter, M.: Learning agile and dynamic motor skills for legged robots. Sci. Robot. 4(26) (2019). https://doi.org/10.1126/scirobotics.aau5872
Margolis, G.B., Yang, G., Paigwar, K., Chen, T., Agrawal, P.: Rapid locomotion via reinforcement learning (2022). arXiv:2205.02824
Margolis, G.B., Chen, T., Paigwar, K., Fu, X., Kim, D., Kim, S., Agrawal, P.: Learning to jump from pixels (2021). arXiv:2110.15344
Alexander, R.M.: Optimization and gaits in the locomotion of vertebrates. Physiol. Rev. 69(4), 1199–1227 (1989)
Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., et al.: Soft actor-critic algorithms and applications (2018). arXiv:1812.0590
Rudin, N., Hoeller, D., Reist, P., Hutter, M.: Learning to walk in minutes using massively parallel deep reinforcement learning. In: Conference on Robot Learning, pp. 91–100 (2022)
Danilov, V., Diane, S.: CPG-based gait generator for a quadruped robot with sidewalk and turning operations. In: Robotics in Natural Settings: CLAWAR 2022, pp. 276–288 (2022). https://doi.org/10.1007/978-3-031-15226-9_27
Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., Zaremba, W.: OpenAI Gym (2016). arXiv:1606.01540
Yu, J., Tan, M., Chen, J., Zhang, J.: A survey on CPG-inspired control models and system implementation. IEEE Trans. Neural Netw. Learn. Syst. 25(3), 441–456 (2013)
Akiba, T., Sano, S., Yanase, T., Ohta, T., Koyama, M.: Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2623–2631 (2019)
OpenAI Gym Leaderboard. https://github.com/openai/gym/wiki/Leaderboard
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Danilov, V., Klimov, K., Kapytov, D., Diane, S. (2024). Bipedal Walking Robot Control Using PMTG Architecture. In: Youssef, E.S.E., Tokhi, M.O., Silva, M.F., Rincon, L.M. (eds) Synergetic Cooperation between Robots and Humans. CLAWAR 2023. Lecture Notes in Networks and Systems, vol 811. Springer, Cham. https://doi.org/10.1007/978-3-031-47272-5_8
DOI: https://doi.org/10.1007/978-3-031-47272-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47271-8
Online ISBN: 978-3-031-47272-5
eBook Packages: Intelligent Technologies and Robotics (R0)