ABSTRACT
Multi-agent reinforcement learning (RL) trains agents to collaborate with one another. We argue that applying sim2real to multi-agent RL introduces an additional, system-level 'reality gap', especially when the transferred collaborative task is executed in a real-world environment. In this paper, we propose an ADO framework that enables decentralized agents to take part in collaborative tasks without suffering from this reality gap. Our contribution is threefold. First, we identify and summarize the reality gaps that arise in sim2real for multi-agent RL. Second, we propose a new system model that addresses the system issues arising when collaborative tasks are executed. Third, we design and implement a software framework that handles the system issues involved in developing and executing collaborative tasks in the real world.
Index Terms
- A sim2real framework enabling decentralized agents to execute MADDPG tasks