A Deep Reinforcement Learning Approach for Cooperative Target Defense

Xiong, Yanxue; Wang, Zhigang; Ke, Liangjun

doi:10.1007/978-981-19-9297-1_2

Conference paper
First Online: 20 January 2023

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1744)

Abstract

This paper considers a new variant of the pursuit-evasion problem: the cooperative target defense problem with three agents (an attacker, a targeter, and a defender) in 3D space. The targeter tries to fly from a starting point to a terminal point as quickly as possible, while the defender seeks to protect it from the attacker. The problem is difficult to solve with traditional game-theoretic methods, whereas deep reinforcement learning (DRL) has shown strong adaptability in such complex, high-dimensional tasks. Inspired by successful applications of Proximal Policy Optimization (PPO), this paper proposes a PPO-based algorithm for the problem, aiming to derive the optimal behavioral policies for both sides. We design the corresponding state space, action space, and rewards of the agents. Three kinds of reward functions are proposed for the attacker and compared experimentally. Our study provides a foundation for further study of the cooperative target defense problem.
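
For readers unfamiliar with PPO, the clipped surrogate objective from Schulman et al. (2017) can be sketched as follows. This is a minimal illustration of the general technique only, not the authors' implementation: the abstract does not specify their state space, action space, reward functions, or hyperparameters, so the function name `ppo_clip_objective`, its arguments, and the default `clip_eps=0.2` are all assumptions made here for illustration.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (to be maximized) over a batch.

    Hypothetical helper for illustration; inputs are plain Python lists:
      new_log_probs: log pi_new(a_t | s_t) under the policy being updated
      old_log_probs: log pi_old(a_t | s_t) under the data-collecting policy
      advantages:    advantage estimates A_t for each sampled step
    """
    if not advantages:
        raise ValueError("batch must not be empty")
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("inputs must have equal length")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio r_t = pi_new / pi_old, computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The `min` makes the objective a pessimistic bound: once the ratio leaves the clipping interval in the direction that would increase the objective, further movement yields no extra gain, which discourages overly large policy updates. In a multi-agent setting such as this one, each side's policy would presumably be trained against its own reward signal, but how the paper arranges this is not stated in the abstract.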

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant 61973244 and Grant 61573277. It was also supported by the open fund of CETC Key Laboratory of Data Link Technology (CLDL-20202101-1).

Author information

Authors and Affiliations

State Key Laboratory for Manufacturing Systems Engineering, School of Automation Science and Engineering, Xi’an Jiaotong University, Xi’an, China
Yanxue Xiong & Liangjun Ke
CETC Key Laboratory of Data Link Technology Xi’an, Xi’an, China
Zhigang Wang

Corresponding author

Correspondence to Liangjun Ke.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Ying Tan
Southern University of Science and Technology, Shenzhen, China
Yuhui Shi

About this paper

Cite this paper

Xiong, Y., Wang, Z., Ke, L. (2022). A Deep Reinforcement Learning Approach for Cooperative Target Defense. In: Tan, Y., Shi, Y. (eds) Data Mining and Big Data. DMBD 2022. Communications in Computer and Information Science, vol 1744. Springer, Singapore. https://doi.org/10.1007/978-981-19-9297-1_2

DOI: https://doi.org/10.1007/978-981-19-9297-1_2
Published: 20 January 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-9296-4
Online ISBN: 978-981-19-9297-1
eBook Packages: Computer Science, Computer Science (R0)
