A differential game for cooperative target defense☆
Introduction
The multi-player pursuit–evasion game is an important tool to deal with the maneuver decision problem arising in the cooperative control of multi-agent systems (Bopardikar et al., 2009, Ding et al., 2010, Ramana and Kothari, 2015), especially in confrontational circumstances (D’Andrea and Murray, 2003, Ge et al., 2008, Huang et al., 2015, Isaacs, 1965, Kraska and Rzymowski, 2011). In many practical applications, agents are required to assist other agents in completing confrontation tasks such as an interceptor defending an asset against an intruder (Li & Cruz, 2011), a torpedo safeguarding a naval ship against a submarine (Boyell, 1976), and a bodyguard protecting a potential victim against a bandit (Rusnak, 2005). The common feature of these application scenarios is that there are three players – Target, Attacker, and Defender – and this is known as a Target–Attacker–Defender (TAD) game.
In a TAD game, the Attacker aims to capture the Target while avoiding capture by the Defender, and the Defender tries to protect the Target from the Attacker while trying to capture the Attacker at an opportune moment. In other words, the Target and Defender cooperate as a team, and the Attacker is on the opposite side. The TAD game studied here differs from conventional pursuit–evasion games because the players' roles are no longer simply to chase or to escape. The Attacker must pursue the Target and, at the same time, escape from the Defender. The Target must cooperate with the Defender in addition to escaping from the Attacker. The Defender, in turn, cooperates with the Target to prevent the Attacker from achieving its goal. These mixed roles make the TAD problem more difficult to solve.
The TAD game was first presented by Boyell (1976), who examined the setting in which a moving target launches a defensive missile or torpedo to protect itself against an incoming missile or torpedo. The problem has since attracted considerable scholarly attention. Different types of cooperation mechanisms in the Target–Defender team have been developed by Casbeer, Garcia, Fuchs, and Pachter (2015), Garcia, Casbeer, and Pachter (2015a, 2015b, 2017), Garcia et al. (2014, 2015), Pachter, Garcia, and Casbeer (2014), Perelman, Shima, and Rusnak (2011), Prokopov and Shima (2013), Shima (2011) and Shaferman and Shima (2010). For example, in Shaferman and Shima (2010), Shima (2011), Prokopov and Shima (2013) and Perelman et al. (2011), where the Defender is a missile fired from an evading aircraft to defend against an incoming homing missile, the authors considered the line-of-sight (LOS) guidance law for the Defender. Implementing the LOS guidance law requires the Defender to stay on (ride) the LOS between the Target and Attacker. In Casbeer et al. (2015), Garcia et al. (2014, 2015), Garcia et al. (2015a, 2015b, 2017), and Pachter et al. (2014), where the Defender missile is launched by the Target or by a Target-friendly platform, the Target and Defender cooperate so that the Defender captures the Attacker before the Attacker captures the Target. In the aforementioned literature, the Attacker, regarded as a non-recycled missile, aims only to minimize the distance between itself and the Target. Whether the Attacker is captured by the Defender is seldom considered. Balancing the roles of the Attacker is considered only in Rubinsky and Gutman (2012, 2014), who present the switch time for the Attacker in the TAD end-game based on the distance between the Target and Attacker when the Target and Defender do not cooperate.
In addition, most of the abovementioned studies focus on quantitative analysis, leaving some fundamental issues unsolved. For example, under what conditions can a player win or lose the game? How should a control scheme for players be designed?
These issues can be addressed within a game of kind by constructing a barrier (Isaacs, 1965), namely a semi-permeable surface that partitions the state space into disjoint regions. Each region is associated with a player who wins the game if the initial state of the players lies in that region. Generally, there are two methods of solving the multi-player pursuit–evasion problem in terms of a game of kind. The first method is to divide the game into several sub-games and then analyze the optimal behaviors of the players for each sub-game by using Isaacs' classic approach (Isaacs, 1965). Under this approach, the barrier is constructed by integrating the so-called retrogressive path equations (RPEs) from the points on the boundary of the usable part of the target set (or terminal manifold). For example, Bhattacharya et al. (2014, 2016) study a visibility-based target tracking game in the presence of a circular obstacle by exploiting the symmetry of the environment to reduce the dimension of the state space to three and then constructing the barrier using Isaacs' techniques. The second method is the explicit policy method (Bakolas and Tsiotras, 2012, Bopardikar et al., 2008, Bopardikar et al., 2009, Isaacs, 1965, Zha et al., 2016), in which a strategy is prescribed for the players and the possibility of winning the game under that strategy is then analyzed. For example, Ramana and Kothari (2017) study a multi-player pursuit–evasion game with one superior evader, using overlapping Apollonius circles around that evader. Oyler, Kabamba, and Girard (2016) consider a P3 game (prey, protector, predator) in which the protector and prey aim to rendezvous before the latter is captured by the predator, and they present the conditions under which each side dominates the game of kind by means of Apollonius circles.
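The Apollonius circle used in such analyses is the set of points that two players moving in straight lines at constant speeds would reach at the same time. The following is a minimal sketch of that standard geometric construction, not the procedure of any of the cited works. The function name `apollonius_circle` and the argument names are assumptions introduced here for illustration.

```python
import math


def apollonius_circle(evader, pursuer, speed_ratio):
    """Return (center, radius) of the locus |X - evader| / |X - pursuer| = speed_ratio.

    speed_ratio is the evader's speed divided by the pursuer's speed
    (an illustrative convention chosen for this sketch).
    """
    k2 = speed_ratio ** 2
    if math.isclose(k2, 1.0):
        # With equal speeds the locus degenerates to the perpendicular bisector.
        raise ValueError("equal speeds: the locus is a line, not a circle")
    ex, ey = evader
    px, py = pursuer
    # Center lies on the line through both players.
    cx = (ex - k2 * px) / (1.0 - k2)
    cy = (ey - k2 * py) / (1.0 - k2)
    radius = speed_ratio * math.hypot(ex - px, ey - py) / abs(1.0 - k2)
    return (cx, cy), radius


if __name__ == "__main__":
    # Hypothetical positions: evader at the origin, pursuer at (3, 0), evader half as fast.
    center, radius = apollonius_circle((0.0, 0.0), (3.0, 0.0), 0.5)
    print(center, radius)
```

When the evader is the slower player (ratio below one), the circle encloses the evader, and points inside it are those the evader can reach before the pursuer under straight-line motion.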
In some respects, the TAD game is a two-pronged pursuit–evasion problem: Attacker–Target and Defender–Attacker. However, it is difficult to divide the target set in the game into several successive sub-target sets in sequence. Hence, although the TAD game can be described by a three-dimensional state space, solving the RPEs analytically in this case is challenging. Inspired by the abovementioned methods in the literature, we thus decompose the TAD game into sub-problems and demonstrate strategies for players by using the explicit policy method. We then use geometric analysis and the Pontryagin maximum principle to analyze the possibility of the players winning or losing the game.
The main contributions of this work are threefold. First, we simultaneously account for the balancing of the Attacker's roles and the cooperation within the Target–Defender team. Second, we combine the explicit policy method with geometric analysis to solve the TAD game and obtain the barrier, thereby providing a new approach to solving such a game of kind. Finally, we combine a game of kind and a game of degree in the TAD game. We can then obtain the optimal trajectories associated with the optimal control strategies of the players for every zone, which together constitute the complete solution of the TAD game. To the best of our knowledge, this is the first study that provides such a complete solution of the TAD game. The results obtained can be employed to solve the maneuvering decision problems arising in the cooperative control of multi-agent systems in adversarial environments such as search and rescue operations and the recovery of military equipment.
The rest of the paper is organized as follows. Section 2 formulates the TAD game. In Section 3, we briefly introduce some explicit policies as well as provide the winning condition for the players and construct the barrier of the TAD game. In Section 4, the optimal control strategies and corresponding trajectories for the players in different winning regions are obtained. Finally, Section 5 concludes.
Problem formulation
In this section, we present the problem formulation of the TAD game. As shown in Fig. 1, the Target, Attacker, and Defender move in the plane, each at its own speed. The dynamics of the Target, Attacker, and Defender are described by equations in the players' positions and the corresponding control
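As an illustration only, one integration step of planar motion could be sketched as follows, assuming simple-motion kinematics in which each player moves at a constant speed and controls its heading angle. The model form, the function name `step`, and all numerical values are assumptions made here and are not taken from the text above.

```python
import math


def step(position, speed, heading, dt):
    """Advance one player by dt under assumed simple-motion kinematics.

    position: (x, y) in the plane
    speed:    constant speed of the player (assumed)
    heading:  heading angle in radians, treated as the control input (assumed)
    """
    x, y = position
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt)


if __name__ == "__main__":
    # Hypothetical initial positions, speeds, and headings for the three players.
    players = {
        "Target": ((0.0, 0.0), 1.0, 0.0),
        "Attacker": ((5.0, 0.0), 1.5, math.pi),
        "Defender": ((0.0, 1.0), 1.2, 0.0),
    }
    for name, (pos, speed, heading) in players.items():
        print(name, step(pos, speed, heading, dt=0.1))
```

A forward-Euler step is exact here as long as the heading is held constant over the step, because the velocity does not depend on the position.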
Game of kind
To determine the conditions under which the players win or lose the game, we turn to the theory of games of kind in this section. The TAD game formulated in the previous section includes three terminal sets. It is difficult to construct the barrier directly by using Isaacs' classic approach because sketching the boundary of the terminal set is a challenge. Instead, we can address this problem by using the explicit policy method that
Optimal strategies for different regions
We have already analyzed the TAD problem from the perspective of the game of kind and constructed the barrier that divides the space into the winning region of the Attacker and the winning region of the Target–Defender team. Hence, in this section, we present the optimal strategy for each player in the different winning regions. When the state lies in the winning region of the Attacker, the Attacker seeks to capture the Target in the shortest time, whereas the Target–Defender team tries to prolong the game as much as possible. In the same way, when the
Conclusion
In this study, we investigate a TAD game in which the Attacker tries to capture the Target while avoiding interception by the Defender, and the Defender cooperates with the Target to intercept the Attacker and defend the Target. We consider both the cooperation within the Target–Defender team and the balancing of the Attacker's role between pursuer and evader. By employing the explicit policy method, we construct a barrier that separates the whole space into the winning region of the Attacker and the
Li Liang received the B.E. and M.S. degrees in control science and engineering from the Inner Mongolia University, Hohhot, China, in 2003 and 2006, respectively.
She is currently a Ph.D. student with the School of Automation, Beijing Institute of Technology. Her research interests include differential games, multi-agent systems, multi-objective optimization and decision.
References (31)
- et al. Relay pursuit of a maneuvering target using dynamic Voronoi diagrams. Automatica (2012)
- et al. A cooperative Homicidal Chauffeur game. Automatica (2009)
- et al. Active target defense using first order missile models. Automatica (2017)
- et al. Pursuit-evasion games in the presence of obstacles. Automatica (2016)
- The lady, the bandits and the body guards–a two team dynamic game. IFAC Proceedings Volumes (2005)
- et al. On the construction of barrier in a visibility based pursuit evasion game
- et al. A visibility-based pursuit-evasion game with a circular obstacle. Journal of Optimization Theory and Applications (2016)
- et al. On discrete-time pursuit-evasion games with sensing limitations. IEEE Transactions on Robotics (2008)
- Defending a moving target against missile or torpedo attack. IEEE Transactions on Aerospace and Electronic Systems (1976)
- et al. Cooperative target defense differential game with a constrained-maneuverable Defender
- The RoboFlag competition
- Multi-UAV convoy protection: An optimal approach to path planning and coordination. IEEE Transactions on Robotics
- Active target defense differential game with a fast defender
- Cooperative strategies for optimal aircraft defense from an attacking missile. Journal of Guidance, Control, and Dynamics
- Cooperative aircraft defense from an attacking missile
Fang Deng received the B.E. and Ph.D. degrees in control science and engineering from the Beijing Institute of Technology, Beijing, China, in 2004 and 2009, respectively.
He is currently a Professor with the School of Automation, Beijing Institute of Technology. His current research interests include multi-objective optimization and decision, intelligent fire control, intelligent information processing and smart wearable devices.
Zhihong Peng received the B.S. degree from the Xiangtan Mining Institute (currently, the Hunan University of Science and Technology), Xiangtan, China, in 1995, and the Ph.D. degree from Central South University, Changsha, China, in 2000.
She held one post-doctoral appointment at the Beijing Institute of Technology, Beijing, China. Since 2012, she has been a Professor with the School of Automation, Beijing Institute of Technology. Her current research interests include intelligent information processing, and multi-agent cooperation, optimization and decision.
Xinxing Li received the M.S. in applied mathematics from Beijing Institute of Technology, Beijing, China, in 2014. Currently, he is working towards his Ph.D. degree in control science and engineering at School of Automation, Beijing Institute of Technology. His research interests include optimal control, game theory and reinforcement learning.
Wenzhong Zha received the B.S. degree in mathematics and the Ph.D. degree in control science and engineering from Beijing Institute of Technology, Beijing, China, in 2008 and 2016, respectively. He is the leader of autonomous intelligence of Key Laboratory of Cognition and Intelligence Technology, Information Science Academy of China Electronics Technology Group Corporation.
His current research interests include differential games, dynamic games, robotics, simultaneous location and mapping, multi-agent systems, and incomplete information processing.
☆ This work was supported by the National Natural Science Foundation of China (Grant No. 61203078) and the Key Program of National Natural Science Foundation of China (Grant No. U1613225). The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Gurdal Arslan under the direction of Editor Ian R. Petersen.