Evolutionary reinforcement learning algorithm for large-scale multi-agent cooperation and confrontation applications

Liu, Haiying; Li, ZhiHao; Huang, Kuihua; Wang, Rui; Cheng, Guangquan; Li, Tiexiang

doi:10.1007/s11227-023-05551-2

Published: 29 July 2023

Volume 80, pages 2319–2346, (2024)

The Journal of Supercomputing

Haiying Liu¹,⁴,
ZhiHao Li²,
Kuihua Huang³,
Rui Wang³,
Guangquan Cheng³ &
…
Tiexiang Li⁴

Abstract

Multi-agent cooperation and confrontation technology has developed rapidly in recent years. Most existing multi-agent reinforcement learning algorithms simplify the problem by using shared weights or local observations, and are suitable only for scenarios with fewer than ten agents. Large-scale scenarios therefore call for new research directions. This paper presents a large-scale multi-agent method that combines evolutionary computation with reinforcement learning. The multi-agent learning task is divided into several stages according to the number of agents, and a self-attention mechanism is used to handle the changing number of agents at each stage. At the same time, to counter the poor adaptability of agents carried over from previous stages, evolutionary techniques select the best individuals in the population at each stage of training. Two typical unmanned aerial vehicle cluster missions, multi-domain joint sea crossing and landing, were designed to validate the proposed method, and their operational rules and reward functions are also given. Experiments show that the model trained with the proposed method performs well and stably, and can serve as a multi-agent collaborative decision-making model for large-scale environments.
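
The two mechanisms named in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the function names, the fitness function, the Gaussian mutation, and all hyperparameters are assumptions made for illustration only. The sketch shows (a) an attention pooling step whose output size does not depend on the number of agents, and (b) a loop that raises the agent count stage by stage and keeps only the best individuals of a population at each stage. The reinforcement learning updates between selections are omitted.

```python
import math
import random


def attention_pool(query, agent_feats):
    """Scaled dot-product attention of one query over N agent feature vectors.

    Simplification: keys and values are the raw features (no learned
    projections). Returns a fixed-size pooled vector and the attention
    weights, for any N >= 1.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, feat)) / math.sqrt(d)
              for feat in agent_feats]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * feat[i] for w, feat in zip(weights, agent_feats))
              for i in range(d)]
    return pooled, weights


def toy_fitness(policy, n_agents):
    """Hypothetical stand-in for an episode return with n_agents agents."""
    dim = len(policy)
    # Deterministic fake observations, one feature vector per agent.
    obs = [[math.sin(j + i) for i in range(dim)] for j in range(n_agents)]
    pooled, _ = attention_pool(policy, obs)
    alignment = sum(p * q for p, q in zip(policy, pooled))
    penalty = 0.1 * sum(p * p for p in policy)
    return alignment - penalty


def staged_evolution(stage_sizes, pop_size=8, n_elite=2, dim=4, seed=0):
    """Run one selection round per stage, with a growing number of agents."""
    rng = random.Random(seed)
    population = [[rng.gauss(0, 1) for _ in range(dim)]
                  for _ in range(pop_size)]
    history = []
    for n_agents in stage_sizes:
        # RL training of each individual at this scale would happen here
        # (omitted in this sketch).
        ranked = sorted(population,
                        key=lambda p: toy_fitness(p, n_agents),
                        reverse=True)
        elites = ranked[:n_elite]
        history.append((n_agents, toy_fitness(elites[0], n_agents)))
        # Next stage starts from the elites plus mutated copies of them.
        population = [list(e) for e in elites]
        while len(population) < pop_size:
            parent = rng.choice(elites)
            population.append([w + rng.gauss(0, 0.1) for w in parent])
    return population, history


if __name__ == "__main__":
    final_population, history = staged_evolution([3, 10, 50])
    for n_agents, best in history:
        print(f"stage with {n_agents} agents: best toy fitness {best:.3f}")
```

Because the pooled vector has the same length for 3 agents as for 50, the same policy parameters can be carried from one stage to the next, which is what makes the staged selection loop possible in this sketch.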

Data availability

Data will be made available on request.

Acknowledgments

Research for this paper was supported by the Equipment Advance Research Project (50912020401), the Hunan Key Laboratory of Intelligent Decision-Making Technology for Emergency Management (2020TP1013), and the Natural Science Basic Research Plan in Shanxi Province of China (No. 2018JM6011). The authors also gratefully acknowledge the reviewers' helpful comments and suggestions, which have improved the presentation.

Funding

This study was funded by the Equipment Advance Research Project (50912020401), the Hunan Key Laboratory of Intelligent Decision-Making Technology for Emergency Management (2020TP1013), and the Natural Science Basic Research Plan in Shanxi Province of China (2018JM6011).

Author information

Authors and Affiliations

College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, People’s Republic of China
Haiying Liu
Research Institute 52 of China Electronics Technology Group Corporation, Hangzhou, 311122, Zhejiang, People’s Republic of China
ZhiHao Li
College of System Engineering, National University of Defense Technology, Changsha, 410073, People’s Republic of China
Kuihua Huang, Rui Wang & Guangquan Cheng
Nanjing Center for Applied Mathematics, Nanjing, 211135, Jiangsu, People’s Republic of China
Haiying Liu & Tiexiang Li

Contributions

HL contributed to conceptualization, methodology, research management. ZL contributed to methodology, software. KH contributed to conceptualization, methodology. RW contributed to conceptualization, methodology. GC contributed to conceptualization, methodology. TL contributed to algorithm support. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Haiying Liu or Kuihua Huang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, H., Li, Z., Huang, K. et al. Evolutionary reinforcement learning algorithm for large-scale multi-agent cooperation and confrontation applications. J Supercomput 80, 2319–2346 (2024). https://doi.org/10.1007/s11227-023-05551-2

Accepted: 14 July 2023
Published: 29 July 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11227-023-05551-2
