Developing multi-agent adversarial environment using reinforcement learning and imitation learning

  • Original Article
  • Published in Artificial Life and Robotics

Abstract

A multi-agent system is a collection of autonomous, interacting agents that share a common environment, each observing that environment through sensors and acting on it. Multi-agent systems that learn cooperative strategies through reinforcement learning often perform poorly, largely because of the sparse reward problem. This study builds a 3D environment in which robots play beach volleyball, and it combines imitation learning (IL) with reinforcement learning (RL) to address the sparse reward problem. The results show that the proposed approach achieves a higher score in the Elo rating system and that its robots outperform those trained with the conventional RL approach.
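The recipe the abstract describes, warm-starting a reinforcement-learning policy with imitation learning so agents receive useful learning signal before the sparse game reward ever fires, can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: `Policy`, `bc_pretrain`, the discrete-action setup, and the tensor shapes are hypothetical, and the subsequent RL fine-tuning (e.g., a PPO loop) would typically come from a standard RL library.

```python
# Hypothetical sketch of IL + RL: behavioral cloning on expert
# demonstrations, followed by conventional RL fine-tuning.
# Policy, bc_pretrain, obs_dim, and n_actions are illustrative names,
# not the paper's actual code.
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Small MLP mapping an agent's observation to action logits."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def bc_pretrain(policy: Policy,
                demo_obs: torch.Tensor,      # [N, obs_dim] expert observations
                demo_actions: torch.Tensor,  # [N] expert action indices
                epochs: int = 50,
                lr: float = 1e-3) -> Policy:
    """Behavioral cloning: supervised learning on (observation, action)
    pairs from demonstrations. The pretrained policy then seeds the RL
    stage, so exploration starts near sensible volleyball behavior
    instead of random motion under a sparse reward."""
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        logits = policy(demo_obs)
        loss = loss_fn(logits, demo_actions)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return policy
```

Evaluation by Elo rating, as reported in the abstract, compares agents through head-to-head matches. A standard Elo update, generic rather than specific to this paper, looks like:

```python
def elo_update(rating_a: float, rating_b: float,
               score_a: float, k: float = 32.0) -> tuple[float, float]:
    """One Elo update after a match between agents A and B.
    score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses."""
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta
```

Under this scheme, an agent that consistently beats its RL-only counterpart accumulates a higher rating, which is the comparison the abstract reports.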


Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.


Author information

Correspondence to Ziyao Han.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was presented in part at the joint symposium of the 28th International Symposium on Artificial Life and Robotics, the 8th International Symposium on BioComplexity, and the 6th International Symposium on Swarm Behavior and Bio-Inspired Robotics (Beppu, Oita and Online, January 25-27, 2023).

About this article


Cite this article

Han, Z., Liang, Y. & Ohkura, K. Developing multi-agent adversarial environment using reinforcement learning and imitation learning. Artif Life Robotics 28, 703–709 (2023). https://doi.org/10.1007/s10015-023-00912-9
