Abstract
Swarm confrontation is a long-standing and active research topic. Previous research has focused on hand-crafted rules to improve the intelligence of the swarm, an approach that does not scale to complex scenarios. Multi-agent reinforcement learning has been applied to similar confrontation tasks, but many of these works use a centralized method to control all entities in a swarm, which makes it hard to meet the real-time requirements of practical systems. OpenAI recently proposed the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which supports centralized training with decentralized execution in multi-agent environments. We examine the method in a swarm confrontation environment that we constructed and find that it struggles with complex scenarios. We propose two improved training methods, scenario-transfer training and self-play training, which greatly enhance the performance of MADDPG. Experimental results show that scenario-transfer training accelerates convergence by 50%, and self-play training increases the winning rate of MADDPG from 42% to 96%.
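The split between centralized training and decentralized execution can be illustrated with a minimal sketch. It is not the authors' implementation: the `Agent` class, the random linear maps standing in for neural networks, and the dimensions are all hypothetical, and no learning update is performed. The sketch only shows which inputs each component sees: an actor receives its own agent's observation, while a critic receives the observations and actions of every agent.

```python
import random

def make_linear(n_in, n_out, rng):
    """Random weight rows; a hypothetical stand-in for a trained network."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def apply_linear(weights, x):
    """Multiply the weight rows by the input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

class Agent:
    def __init__(self, obs_dim, act_dim, n_agents, rng):
        # Actor: sees only this agent's own observation (used at execution time).
        self.actor = make_linear(obs_dim, act_dim, rng)
        # Critic: sees every agent's observation and action (training time only).
        self.critic = make_linear(n_agents * (obs_dim + act_dim), 1, rng)

    def act(self, own_obs):
        return apply_linear(self.actor, own_obs)

    def q_value(self, all_obs, all_acts):
        # Flatten all observations, then all actions, into one joint vector.
        joint = [v for o in all_obs for v in o] + [v for a in all_acts for v in a]
        return apply_linear(self.critic, joint)[0]

rng = random.Random(0)
N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2  # illustrative sizes, not from the paper
agents = [Agent(OBS_DIM, ACT_DIM, N_AGENTS, rng) for _ in range(N_AGENTS)]
observations = [[rng.random() for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]

# Decentralized execution: each agent acts from its local observation alone.
actions = [agent.act(obs) for agent, obs in zip(agents, observations)]

# Centralized training signal: each critic scores the joint observation-action vector.
q_values = [agent.q_value(observations, actions) for agent in agents]
```

Because the critics are needed only during training, each agent can be deployed with its actor alone, which is what allows execution without a central controller.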
Acknowledgement
This work was partially supported by the National Natural Science Foundation of China (No. 91648204 and 61532007), the National Key Research and Development Program of China (No. 2017YFB1001900 and 2017YFB1301104), and the National Science and Technology Major Project.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, G., Li, Y., Xu, X., Dai, H. (2019). Multiagent Reinforcement Learning for Swarm Confrontation Environments. In: Yu, H., Liu, J., Liu, L., Ju, Z., Liu, Y., Zhou, D. (eds) Intelligent Robotics and Applications. ICIRA 2019. Lecture Notes in Computer Science, vol 11742. Springer, Cham. https://doi.org/10.1007/978-3-030-27535-8_48
DOI: https://doi.org/10.1007/978-3-030-27535-8_48
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27534-1
Online ISBN: 978-3-030-27535-8
eBook Packages: Computer Science, Computer Science (R0)