Developing multi-agent adversarial environment using reinforcement learning and imitation learning

  • Original Article
  • Published in Artificial Life and Robotics

Abstract

A multi-agent system is a collection of autonomous, interacting agents that share a common environment, each observing that environment through sensors and acting on it. Multi-agent systems that learn cooperative strategies through reinforcement learning often perform poorly, largely because of the sparse reward problem. This study builds a 3D environment in which robots play beach volleyball, and it combines imitation learning (IL) with reinforcement learning (RL) to address the sparse reward problem. The results show that the proposed approach achieves a higher score in the Elo rating system and that its robots outperform those trained with the conventional RL approach.
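The recipe the abstract describes, warm-starting a reinforcement-learning policy with imitation learning so agents receive useful learning signal before the sparse game reward ever fires, can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: `Policy`, `bc_pretrain`, the discrete-action setup, and the tensor shapes are hypothetical, and the subsequent RL fine-tuning (e.g., a PPO loop) would typically come from a standard RL library.

```python
# Hypothetical sketch of IL + RL: behavioral cloning on expert
# demonstrations, followed by conventional RL fine-tuning.
# Policy, bc_pretrain, obs_dim, and n_actions are illustrative names,
# not the paper's actual code.
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Small MLP mapping an agent's observation to action logits."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def bc_pretrain(policy: Policy,
                demo_obs: torch.Tensor,      # [N, obs_dim] expert observations
                demo_actions: torch.Tensor,  # [N] expert action indices
                epochs: int = 50,
                lr: float = 1e-3) -> Policy:
    """Behavioral cloning: supervised learning on (observation, action)
    pairs from demonstrations. The pretrained policy then seeds the RL
    stage, so exploration starts near sensible volleyball behavior
    instead of random motion under a sparse reward."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        logits = policy(demo_obs)
        loss = loss_fn(logits, demo_actions)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return policy
```

Evaluation by Elo rating, as reported in the abstract, compares agents through head-to-head matches. A standard Elo update, generic rather than specific to this paper, looks like:

```python
def elo_update(rating_a: float, rating_b: float,
               score_a: float, k: float = 32.0) -> tuple[float, float]:
    """One Elo update after a match between agents A and B.
    score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses."""
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta
```

Under this scheme, an agent that consistently beats its RL-only counterpart accumulates a higher rating, which is the comparison the abstract reports.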


Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.


Author information

Correspondence to Ziyao Han.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was presented in part at the joint symposium of the 28th International Symposium on Artificial Life and Robotics, the 8th International Symposium on BioComplexity, and the 6th International Symposium on Swarm Behavior and Bio-Inspired Robotics (Beppu, Oita and Online, January 25-27, 2023).

About this article


Cite this article

Han, Z., Liang, Y. & Ohkura, K. Developing multi-agent adversarial environment using reinforcement learning and imitation learning. Artif Life Robotics 28, 703–709 (2023). https://doi.org/10.1007/s10015-023-00912-9
