Elsevier

Automatica

Volume 102, April 2019, Pages 58-71
A differential game for cooperative target defense

https://doi.org/10.1016/j.automatica.2018.12.034

Abstract

Multi-player pursuit–evasion games are crucial for addressing the maneuver decision problem arising in the cooperative control of multi-agent systems. This work addresses a particular pursuit–evasion game with three players: Target, Attacker, and Defender. The Attacker aims to capture the Target while avoiding capture by the Defender, and the Defender tries to defend the Target from being captured by the Attacker while trying to capture the Attacker at an opportune moment. A two-pronged pursuit–evasion problem in this game is considered, and we focus on two aspects: the cooperation between the Target and Defender, and balancing the roles of the Attacker between pursuer and evader. A barrier based on the explicit policy method and geometric analysis is constructed to separate the whole state space into two disjoint parts that correspond to the winning regions of the Attacker and of the Target–Defender team. The main contributions of this work are obtaining the players’ winning regions and providing a complete game solution by analyzing the optimal strategies and trajectories of the players based on the barrier.

Introduction

The multi-player pursuit–evasion game is an important tool to deal with the maneuver decision problem arising in the cooperative control of multi-agent systems (Bopardikar et al., 2009, Ding et al., 2010, Ramana and Kothari, 2015), especially in confrontational circumstances (D’Andrea and Murray, 2003, Ge et al., 2008, Huang et al., 2015, Isaacs, 1965, Kraska and Rzymowski, 2011). In many practical applications, agents are required to assist other agents in completing confrontation tasks such as an interceptor defending an asset against an intruder (Li & Cruz, 2011), a torpedo safeguarding a naval ship against a submarine (Boyell, 1976), and a bodyguard protecting a potential victim against a bandit (Rusnak, 2005). The common feature of these application scenarios is that there are three players – Target, Attacker, and Defender – and this is known as a Target–Attacker–Defender (TAD) game.

In a TAD game, the Attacker aims to capture the Target while avoiding capture by the Defender, and the Defender tries to defend the Target from being captured by the Attacker while trying to capture the Attacker at an opportune moment. In other words, the Target and Defender cooperate as a team, and the Attacker is on the opposite side. This study, which focuses on solving such a TAD game, differs from conventional pursuit–evasion games because the players’ roles have changed. The task of the players is not merely to chase or to escape; the Attacker also needs to escape from the Defender. Meanwhile, the Target must cooperate with the Defender in addition to escaping from the Attacker. Moreover, the Defender cooperates with the Target to prevent the Attacker from achieving its goal. These changes in the players’ roles make the TAD problem more difficult to solve.

The TAD game was first presented by Boyell (1976), who examines the setting in which a moving target launches a missile or torpedo to defend itself against an incoming missile. Subsequently, because of the nature of this problem, much scholarly attention has been paid to the issue. Different types of cooperation mechanisms in the Target–Defender team have been developed by Casbeer, Garcia, Fuchs, and Pachter (2015), Garcia, Casbeer, and Pachter (2015a, 2015b, 2017), Garcia et al. (2014, 2015), Pachter, Garcia, and Casbeer (2014), Perelman, Shima, and Rusnak (2011), Prokopov and Shima (2013), Shima (2011), and Shaferman and Shima (2010). For example, in Shaferman and Shima (2010), Shima (2011), Prokopov and Shima (2013), and Perelman et al. (2011), where the Defender is a missile fired from an evading aircraft to defend against an incoming homing missile, the authors considered the line-of-sight (LOS) guidance law for the Defender. The implementation of the LOS guidance law requires the Defender to stay on (ride) the LOS between the Target and Attacker. In Casbeer et al. (2015), Garcia et al. (2014, 2015), Garcia et al. (2015a, 2015b, 2017), and Pachter et al. (2014), where the Defender missile is launched by the Target or by a Target-friendly platform, the cooperation between the Target and Defender is such that the Defender captures the Attacker before the Attacker captures the Target. In the aforementioned literature, the Attacker, regarded as a non-recycled missile, aims only to minimize the distance between itself and the Target. Whether the Attacker is captured by the Defender is seldom considered. Balancing the roles of the Attacker is considered only in Rubinsky and Gutman (2012, 2014), who present the switch time for the Attacker in the TAD end-game based on the distance between the Target and Attacker when the Target and Defender do not cooperate. In addition, most of the abovementioned studies focus on quantitative analysis, leaving some fundamental issues unsolved. For example, under what conditions can a player win or lose the game? How should a control scheme for the players be designed?

These issues can be solved within a game of kind that constructs a barrier (Isaacs, 1965), namely a semi-permeable surface that partitions the state space into disjoint regions. Each region is associated with a player who wins the game if the initial position of the players lies in that region. Generally, there are two methods of solving the multi-player pursuit–evasion problem in terms of a game of kind. The first method is to divide the game into several sub-games and then analyze the optimal behaviors of the players for each sub-game by using Isaacs’ classic approach (Isaacs, 1965). Under this approach, the barrier is constructed by integrating the so-called retrogressive path equations (RPEs) from the points on the boundary of the usable part of the target set (or terminal manifold). For example, Bhattacharya et al. (2014, 2016) study a visibility-based target tracking game in the presence of a circular obstacle by reducing the dimension of the state space to three and constructing the barrier using Isaacs’ techniques according to the symmetry of the environment. The second method is the explicit policy method (Bakolas and Tsiotras, 2012, Bopardikar et al., 2008, Bopardikar et al., 2009, Isaacs, 1965, Zha et al., 2016), in which a strategy is assigned to the players and the possibility of winning the game is then analyzed. For example, Ramana and Kothari (2017) study a multi-player pursuit–evasion game with one superior evader, using overlapping Apollonius circles around that evader. Oyler, Kabamba, and Girard (2016) consider a P3 game (prey, protector, predator) in which the protector and prey aim to rendezvous before the latter is captured by the predator, and the conditions under which each side dominates the game of kind are presented by means of Apollonius circles.
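The Apollonius circle mentioned above can be computed in closed form. The following is a minimal sketch, assuming two point agents moving at constant speeds in the plane; the function name `apollonius_circle` and its argument layout are illustrative choices, not taken from the cited works. It returns the circle of points X whose distances satisfy |X - evader| = k |X - pursuer|, where k is the evader-to-pursuer speed ratio, that is, the points both agents reach at the same time when moving in straight lines.

```python
import math


def apollonius_circle(pursuer, evader, speed_ratio):
    """Return (center, radius) of the Apollonius circle.

    pursuer, evader: (x, y) positions (illustrative argument layout).
    speed_ratio: k = V_evader / V_pursuer, assumed positive and != 1.
    The circle is the locus |X - evader| = k * |X - pursuer|.
    """
    k2 = speed_ratio ** 2
    if math.isclose(k2, 1.0):
        # Equal speeds: the locus degenerates to the perpendicular bisector.
        raise ValueError("speed_ratio must differ from 1")
    px, py = pursuer
    ex, ey = evader
    denom = 1.0 - k2
    center = ((ex - k2 * px) / denom, (ey - k2 * py) / denom)
    radius = speed_ratio * math.hypot(ex - px, ey - py) / abs(denom)
    return center, radius


if __name__ == "__main__":
    # Pursuer at the origin, slower evader at (3, 0), k = 0.5.
    print(apollonius_circle((0.0, 0.0), (3.0, 0.0), 0.5))
```

For k < 1 the circle encloses the evader and bounds the points it can reach before the pursuer; for a superior evader (k > 1) the same formula gives a circle around the pursuer instead.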

In some respects, the TAD game is a two-pronged pursuit–evasion problem: Attacker–Target and Defender–Attacker. However, it is difficult to divide the target set in the game into several successive sub-target sets in sequence. Hence, although the TAD game can be described by a three-dimensional state space, solving the RPEs analytically in this case is challenging. Inspired by the abovementioned methods in the literature, we thus decompose the TAD game into sub-problems and demonstrate strategies for players by using the explicit policy method. We then use geometric analysis and the Pontryagin maximum principle to analyze the possibility of the players winning or losing the game.

The main contributions of this work are threefold. First, we simultaneously take into account balancing the roles of the Attacker and the cooperation of the Target–Defender team. Second, we combine the explicit policy method with geometric analysis to solve the TAD game and obtain the barrier, thereby providing a new approach to solving such a game of kind. Finally, we fuse a game of kind and a game of degree into the TAD game. We can then obtain the optimal trajectories associated with the optimal control strategies of the players for every zone, which together constitute the complete solution of the TAD game. To the best of our knowledge, this is the first study that provides such a complete solution of the TAD game. The results obtained can be employed to solve the maneuvering decision problems arising in the cooperative control of multi-agent systems in adversarial environments such as search and rescue operations and the recovery of military equipment.

The rest of the paper is organized as follows. Section 2 formulates the TAD game. In Section 3, we briefly introduce some explicit policies as well as provide the winning condition for the players and construct the barrier of the TAD game. In Section 4, the optimal control strategies and corresponding trajectories for the players in different winning regions are obtained. Finally, Section 5 concludes.

Section snippets

Problem formulation

In this section, we present the problem formulation of the TAD game. As shown in Fig. 1, the Target, Attacker, and Defender move in the plane at speeds of $V_T$, $V_A$, and $V_D$, respectively. The dynamics of the Target, Attacker, and Defender can be described by the following equations:

$$\dot{x}_T = V_T \cos\hat{\phi}, \quad \dot{y}_T = V_T \sin\hat{\phi},$$
$$\dot{x}_A = V_A \cos\hat{\chi}, \quad \dot{y}_A = V_A \sin\hat{\chi},$$
$$\dot{x}_D = V_D \cos\hat{\psi}, \quad \dot{y}_D = V_D \sin\hat{\psi},$$

where the positions of the Target, Attacker, and Defender are denoted as $(x_T, y_T)$, $(x_A, y_A)$, and $(x_D, y_D)$, respectively, and the corresponding control
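The simple-motion kinematics above can be stepped forward numerically. The following is a minimal sketch, assuming each heading is held constant over a time step and using an explicit Euler update; the function name `step`, the dictionary layout keyed by "T", "A", and "D", and all numerical values are illustrative assumptions, not taken from the paper.

```python
import math


def step(state, speeds, headings, dt):
    """Advance every player's position by one explicit Euler step.

    state:    dict name -> (x, y) position
    speeds:   dict name -> constant speed V
    headings: dict name -> heading angle in radians (the control input)
    dt:       time step (illustrative parameter)
    """
    new_state = {}
    for name, (x, y) in state.items():
        v = speeds[name]
        angle = headings[name]
        # x_dot = V cos(heading), y_dot = V sin(heading)
        new_state[name] = (x + v * math.cos(angle) * dt,
                           y + v * math.sin(angle) * dt)
    return new_state


if __name__ == "__main__":
    # Hypothetical initial positions, speeds, and headings.
    state = {"T": (0.0, 0.0), "A": (5.0, 0.0), "D": (0.0, 2.0)}
    speeds = {"T": 1.0, "A": 2.0, "D": 2.0}
    headings = {"T": math.pi, "A": math.pi, "D": 0.0}
    for _ in range(10):
        state = step(state, speeds, headings, dt=0.1)
    print(state)
```

Because the speeds are constant and the headings are fixed within a step, the Euler update is exact for straight-line segments; time-varying strategies would be supplied by recomputing `headings` before each call.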

Game of kind

To confirm the conditions under which the players win or lose the game, we turn to the theory of games of kind in this section. The TAD game formulated in the previous section includes three terminal sets, namely $\{R>0, r=0, d>0\}$, $\{R=0, r>0, d>0\}$, and $\{R>0, r>0, d=0\}$. It is difficult to construct the barrier directly by using Isaacs’ classic approach because sketching the boundary of the terminal set is a challenge. Instead, we can address this problem by using the explicit policy method that
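The three terminal sets listed above can be checked mechanically for a given state. The following is a minimal sketch that tests only the literal conditions as written; the excerpt does not define R, r, and d, so the code assumes they are real-valued state variables and attaches no interpretation (such as which player wins) to each set. The function name `terminal_set` and the tolerance `tol` are illustrative assumptions.

```python
def terminal_set(R, r, d, tol=1e-9):
    """Return the terminal set containing (R, r, d), or None.

    A value counts as zero when |value| <= tol (tol is an assumption
    introduced here for floating-point comparison).
    """
    def sign(value):
        if abs(value) <= tol:
            return "0"
        return "+" if value > 0 else "-"

    pattern = sign(R) + sign(r) + sign(d)
    sets = {
        "+0+": "{R>0, r=0, d>0}",
        "0++": "{R=0, r>0, d>0}",
        "++0": "{R>0, r>0, d=0}",
    }
    # Any other pattern (no zero, several zeros, or a negative value)
    # lies outside the three sets listed in the text.
    return sets.get(pattern)


if __name__ == "__main__":
    print(terminal_set(1.0, 0.0, 2.0))
    print(terminal_set(1.0, 1.0, 1.0))
```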

Optimal strategies for different regions

We have already analyzed the TAD problem from the perspective of the game of kind and constructed the barrier that divides the space into $D_A$ and $D_{TD}$. Hence, in this section, we present the optimal strategy for each player in the different winning areas. When the state lies in the winning region of the Attacker ($D_A$), the Attacker hopes to capture the Target in the shortest time, but the Target–Defender team tries to extend the length of the game as much as possible. In the same way, when the

Conclusion

In this study, we investigate a TAD game in which the Attacker tries to capture the Target while avoiding interception by the Defender, and the Defender cooperates with the Target to intercept the Attacker and defend the Target. We consider the cooperation in the Target–Defender team and balancing the role of the Attacker between pursuer and evader. By employing the explicit policy method, we construct a barrier that separates the whole space into the winning region of the Attacker and the

Li Liang received the B.E. and M.S. degrees in control science and engineering from the Inner Mongolia University, Hohhot, China, in 2003 and 2006, respectively.

She is currently a Ph.D. student with the School of Automation, Beijing Institute of Technology. Her research interests include differential games, multi-agent systems, multi-objective optimization and decision.

References (31)

  • D’Andrea, R. et al. The RoboFlag competition.

  • Ding, X.C. et al. Multi-UAV convoy protection: An optimal approach to path planning and coordination. IEEE Transactions on Robotics (2010).

  • Garcia, E. et al. Active target defense differential game with a fast defender.

  • Garcia, E. et al. Cooperative strategies for optimal aircraft defense from an attacking missile. Journal of Guidance, Control, and Dynamics (2015).

  • Garcia, E. et al. Cooperative aircraft defense from an attacking missile.

Fang Deng received the B.E. and Ph.D. degrees in control science and engineering from the Beijing Institute of Technology, Beijing, China, in 2004 and 2009, respectively.

He is currently a Professor with the School of Automation, Beijing Institute of Technology. His current research interests include multi-objective optimization and decision, intelligent fire control, intelligent information processing and smart wearable devices.

Zhihong Peng received the B.S. degree from the Xiangtan Mining Institute (currently, the Hunan University of Science and Technology), Xiangtan, China, in 1995, and the Ph.D. degree from Central South University, Changsha, China, in 2000.

She held one post-doctoral appointment at the Beijing Institute of Technology, Beijing, China. Since 2012, she has been a Professor with the School of Automation, Beijing Institute of Technology. Her current research interests include intelligent information processing, and multi-agent cooperation, optimization and decision.

Xinxing Li received the M.S. in applied mathematics from Beijing Institute of Technology, Beijing, China, in 2014. Currently, he is working towards his Ph.D. degree in control science and engineering at School of Automation, Beijing Institute of Technology. His research interests include optimal control, game theory and reinforcement learning.

Wenzhong Zha received the B.S. degree in mathematics and the Ph.D. degree in control science and engineering from Beijing Institute of Technology, Beijing, China, in 2008 and 2016, respectively. He is the leader of autonomous intelligence of Key Laboratory of Cognition and Intelligence Technology, Information Science Academy of China Electronics Technology Group Corporation.

His current research interests include differential games, dynamic games, robotics, simultaneous location and mapping, multi-agent systems, and incomplete information processing.

This work was supported by the National Natural Science Foundation of China (Grant No. 61203078) and the Key Program of National Natural Science Foundation of China (Grant No. U1613225). The material in this paper was not presented at any conference. This paper was recommended for publication in revised form by Associate Editor Gurdal Arslan under the direction of Editor Ian R. Petersen.
