Abstract
A multi-agent system is a collection of autonomous, interacting agents that share a common environment. These agents observe the environment through sensors and act upon it. Multi-agent systems that must develop cooperative strategies through reinforcement learning often perform poorly, largely because of the sparse reward problem. This study constructs a 3D environment in which robots play beach volleyball, and it combines imitation learning (IL) with reinforcement learning (RL) to mitigate the sparse reward problem. The results show that the proposed approach achieves a higher Elo rating, and the robots perform better, than a conventional RL approach.
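The agents are compared with the Elo rating system, which updates each player's rating after every match from the gap between the actual and expected outcome. As a minimal sketch of the standard Elo update (the paper does not reproduce its evaluation code, so the K-factor of 32 below is an assumption):

```python
def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated ratings for agents A and B after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k (assumed 32 here) controls how fast ratings move.
    """
    # Expected score of A from the rating gap (logistic curve, base 10).
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    new_a = rating_a + k * (score_a - expected_a)
    # Elo is zero-sum: B gains (or loses) what A loses (or gains).
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# Two equally rated agents: the winner gains exactly k/2 points.
print(elo_update(1200.0, 1200.0, 1.0))  # (1216.0, 1184.0)
```

Because the update is zero-sum, the mean rating of the population stays fixed, so a rising rating reflects genuine improvement against the opponent pool.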
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
This work was presented in part at the joint symposium of the 28th International Symposium on Artificial Life and Robotics, the 8th International Symposium on BioComplexity, and the 6th International Symposium on Swarm Behavior and Bio-Inspired Robotics (Beppu, Oita and Online, January 25-27, 2023).
Cite this article
Han, Z., Liang, Y. & Ohkura, K. Developing multi-agent adversarial environment using reinforcement learning and imitation learning. Artif Life Robotics 28, 703–709 (2023). https://doi.org/10.1007/s10015-023-00912-9