ABSTRACT
Many multi-agent navigation approaches rely on simplified agent representations, such as disks. These simplifications allow thousands of agents to be simulated quickly but limit simulation accuracy and fidelity. In this paper, we propose a fully integrated method for physical character control and multi-agent navigation. In place of sample-inefficient, complex online planning methods, we extend recent deep reinforcement learning techniques. This extension improves on multi-agent navigation models and simulated humanoids by combining multi-agent and hierarchical reinforcement learning. We train a single goal-conditioned low-level policy that provides short-term directed walking behaviour. This task-agnostic controller can be shared by higher-level policies that perform longer-term planning. The proposed approach produces reciprocal collision avoidance, robust navigation, and emergent crowd behaviours. Furthermore, it offers several key affordances not previously possible in multi-agent navigation, including tunable character morphology and physically accurate interactions between agents and with the environment. Our results show that the proposed method outperforms prior methods across environments and tasks, performs well in terms of computation time, and generalizes zero-shot over different numbers of agents.
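The two-level structure described above can be sketched in code: a per-agent high-level planner emits a short-term goal at a coarse timescale, and a single goal-conditioned low-level policy, shared by all agents, turns each goal into low-level walking actions at a higher frequency. This is a minimal illustrative sketch only; all function names, state dimensions, and the 1 m goal horizon are assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def low_level_policy(char_state, goal, weights):
    """Task-agnostic walking controller: maps (character state, short-term
    goal) to bounded joint-level actions. A linear map stands in for the
    trained neural network."""
    x = np.concatenate([char_state, goal])
    return np.tanh(weights @ x)  # bounded torques / PD targets

def high_level_policy(nav_target, agent_pos):
    """Per-agent planner: picks a nearby waypoint toward the navigation
    target, re-planned at a coarser timescale than the low-level steps."""
    direction = nav_target - agent_pos
    dist = np.linalg.norm(direction)
    step = min(dist, 1.0)  # assumed short-term goal at most 1 m away
    return agent_pos + (direction / max(dist, 1e-8)) * step

# The SAME low-level weights are shared across all agents; only the
# high-level goals differ per agent.
shared_weights = rng.standard_normal((8, 5))  # 8 actions, 3+2 inputs
targets = [np.array([5.0, 0.0]), np.array([0.0, 5.0])]
positions = [np.zeros(2), np.ones(2)]

for pos, target in zip(positions, targets):
    goal = high_level_policy(target, pos)
    char_state = np.zeros(3)  # placeholder proprioceptive features
    for _ in range(10):  # low-level runs at a higher control frequency
        action = low_level_policy(char_state, goal, shared_weights)
        assert action.shape == (8,) and np.all(np.abs(action) <= 1.0)
```

Sharing one low-level controller is the key affordance: higher-level policies for different tasks reuse the same walking skill rather than relearning locomotion per task.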
Deep Integration of Physical Humanoid Control and Crowd Navigation