
Dynamic terrain traversal skills using reinforcement learning

Published: 27 July 2015

Abstract

The locomotion skills developed for physics-based characters most often target flat terrain. However, much of their potential lies with the creation of dynamic, momentum-based motions across more complex terrains. In this paper, we learn controllers that allow simulated characters to traverse terrains with gaps, steps, and walls using highly dynamic gaits. This is achieved using reinforcement learning, with careful attention given to the action representation, non-parametric approximation of both the value function and the policy, epsilon-greedy exploration, and the learning of a good state distance metric. The methods enable a 21-link planar dog and a 7-link planar biped to navigate challenging sequences of terrain using bounding and running gaits. We evaluate the impact of the key features of our skill learning pipeline on the resulting performance.
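
The abstract names the main learning components: a non-parametric policy and value function, epsilon-greedy exploration, and a learned state distance metric. As a loose illustration only, here is a minimal Python sketch of how epsilon-greedy exploration over a nearest-neighbor policy with a diagonal distance metric could be wired together; the function names, the inverse-distance kernel, and the Gaussian action noise are all assumptions made for this sketch, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def knn_policy(state, states, actions, metric_weights, k=5):
    """Non-parametric policy: distance-weighted blend of the actions
    stored with the k nearest experience tuples. The diagonal
    metric_weights stand in for a learned state distance metric."""
    d = np.sqrt((((states - state) ** 2) * metric_weights).sum(axis=1))
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-8)      # inverse-distance kernel (assumed)
    w /= w.sum()
    return w @ actions[idx]        # blended continuous action

def select_action(state, states, actions, metric_weights,
                  epsilon=0.1, noise_scale=0.05):
    """Epsilon-greedy exploration for continuous actions: act greedily
    via the non-parametric policy, but with probability epsilon perturb
    the greedy action with Gaussian noise (one common choice)."""
    a = knn_policy(state, states, actions, metric_weights)
    if rng.random() < epsilon:
        a = a + rng.normal(scale=noise_scale, size=a.shape)
    return a

# Toy usage: 200 stored (state, action) tuples, 8-D states, 3-D actions.
states = rng.normal(size=(200, 8))
actions = rng.normal(size=(200, 3))
metric_weights = np.ones(8)        # uniform metric as a placeholder
print(select_action(states[0], states, actions, metric_weights))
```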

Supplementary Material

ZIP File (a80-peng.zip): supplemental files
MP4 File (a80.mp4): supplemental video




Information

Published In
ACM Transactions on Graphics, Volume 34, Issue 4
August 2015
1307 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2809654

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 27 July 2015
    Published in TOG Volume 34, Issue 4


    Author Tags

    1. computer animation
    2. physics simulation

    Qualifiers

    • Research-article


