Multiagent Reinforcement Learning for Swarm Confrontation Environments

Zhang, Guanyu; Li, Yuan; Xu, Xinhai; Dai, Huadong

doi:10.1007/978-3-030-27535-8_48

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11742)

Included in the following conference series:

International Conference on Intelligent Robotics and Applications

Abstract

Swarm confrontation is an active research topic that has attracted considerable attention. Previous research has focused on devising rules to improve the intelligence of the swarm, an approach that does not suit complex scenarios. Multi-agent reinforcement learning has been applied to some similar confrontation tasks. However, many of these works use a centralized method to control all entities in a swarm, which makes it hard to meet the real-time requirements of practical systems. Recently, OpenAI proposed the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which supports centralized training with decentralized execution in multi-agent environments. We examine the method in a swarm confrontation environment that we constructed and find that it does not easily handle complex scenarios. We propose two improved training methods, scenario-transfer training and self-play training, which greatly enhance the performance of MADDPG. Experimental results show that scenario-transfer training accelerates convergence by 50%, and self-play training increases the winning rate of MADDPG from 42% to 96%.
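The phrase "centralized training with decentralized execution" can be made concrete with a small structural sketch. The code below is only an illustration under stated assumptions, not the authors' implementation or OpenAI's code: the class names LinearActor and CentralCritic, the dimensions, and the use of linear functions in place of neural networks are all hypothetical, and no learning update is shown. It shows only which inputs each component sees: an actor uses its own observation, while a critic scores the joint observations and actions of every agent.

```python
import random


class LinearActor:
    """Decentralized policy (hypothetical): maps ONE agent's observation to an action."""

    def __init__(self, obs_dim, act_dim, rng):
        # One weight row per action component; linear map assumed for brevity.
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(obs_dim)]
                  for _ in range(act_dim)]

    def act(self, obs):
        return [sum(wi * oi for wi, oi in zip(row, obs)) for row in self.w]


class CentralCritic:
    """Centralized critic (hypothetical): scores joint observations and joint actions."""

    def __init__(self, n_agents, obs_dim, act_dim, rng):
        self.in_dim = n_agents * (obs_dim + act_dim)
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(self.in_dim)]

    def q_value(self, all_obs, all_acts):
        # Flatten every agent's observation and action into one input vector.
        joint = [v for o in all_obs for v in o] + [v for a in all_acts for v in a]
        assert len(joint) == self.in_dim
        return sum(w * v for w, v in zip(self.w, joint))


# Illustrative sizes only; the paper's environment dimensions are not given here.
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2
rng = random.Random(0)

actors = [LinearActor(OBS_DIM, ACT_DIM, rng) for _ in range(N_AGENTS)]
critics = [CentralCritic(N_AGENTS, OBS_DIM, ACT_DIM, rng) for _ in range(N_AGENTS)]

observations = [[rng.random() for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]

# Execution time: each agent acts on its own observation only.
actions = [actor.act(obs) for actor, obs in zip(actors, observations)]

# Training time: each agent's critic sees all observations and all actions.
q_values = [critic.q_value(observations, actions) for critic in critics]
```

Because the critics are needed only during training, the agents can be deployed with their actors alone, which is what allows each entity to act without a central controller at run time.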

Acknowledgement

This work was partially supported by the National Natural Science Foundation of China (No. 91648204 and 61532007), the National Key Research and Development Program of China (No. 2017YFB1001900 and 2017YFB1301104), and the National Science and Technology Major Project.

Author information

Authors and Affiliations

National University of Defense Technology, Changsha, 410073, Hunan, China
Guanyu Zhang
Artificial Intelligence Research Center, National Innovation Institute of Defense Technology, Beijing, 100071, China
Yuan Li, Xinhai Xu & Huadong Dai

Corresponding author

Correspondence to Yuan Li.

Editor information

Editors and Affiliations

Shenyang Institute of Automation, Shenyang, China
Haibin Yu
Shenyang Institute of Automation, Shenyang, China
Jinguo Liu
Shenyang Institute of Automation, Shenyang, China
Lianqing Liu
University of Portsmouth, Portsmouth, UK
Zhaojie Ju
Shenyang Institute of Automation, Shenyang, China
Yuwang Liu
University of Portsmouth, Portsmouth, UK
Dalin Zhou

About this paper

Cite this paper

Zhang, G., Li, Y., Xu, X., Dai, H. (2019). Multiagent Reinforcement Learning for Swarm Confrontation Environments. In: Yu, H., Liu, J., Liu, L., Ju, Z., Liu, Y., Zhou, D. (eds) Intelligent Robotics and Applications. ICIRA 2019. Lecture Notes in Computer Science, vol 11742. Springer, Cham. https://doi.org/10.1007/978-3-030-27535-8_48

DOI: https://doi.org/10.1007/978-3-030-27535-8_48
Published: 02 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27534-1
Online ISBN: 978-3-030-27535-8
eBook Packages: Computer Science, Computer Science (R0)
