Nash Bargaining Solution based rendezvous guidance of unmanned aerial vehicles

doi:10.1016/j.jfranklin.2018.08.005

Journal of the Franklin Institute

Volume 355, Issue 16, November 2018, Pages 8106-8140

https://doi.org/10.1016/j.jfranklin.2018.08.005

Abstract

This paper addresses a finite-time rendezvous problem for a group of unmanned aerial vehicles (UAVs), in the absence of a leader or a reference trajectory. When the UAVs do not cooperate, they are assumed to use Nash equilibrium strategies (NES). However, when the UAVs can communicate among themselves, they can implement cooperative game theoretic strategies for mutual benefit. In a convex linear quadratic differential game (LQDG), a Pareto-optimal solution (POS) is obtained when the UAVs jointly minimize a team cost functional, which is constructed through a convex combination of individual cost functionals. This paper proposes an algorithm to determine the convex combination of weights corresponding to the Pareto-optimal Nash Bargaining Solution (NBS), which offers each UAV a lower cost than that incurred from the NES. Conditions on the cost functions that make the proposed algorithm converge to the NBS are presented. A UAV that is programmed to choose its strategies at a given time based upon cost-to-go estimates for the rest of the game duration may switch to NES if it finds this more beneficial than continuing with a cooperative strategy it previously agreed upon with the other UAVs. For such scenarios, a renegotiation method is proposed that uses the proposed algorithm to obtain the NBS corresponding to the state of the game at an intermediate time. This renegotiation method helps to establish cooperation between UAVs and prevents non-cooperative behaviour. In this context, the conditions for time consistency of a cooperative solution are derived for LQDGs. The efficacy of the guidance law derived from the proposed algorithm is illustrated through simulations.

Introduction

Unmanned aerial vehicles (UAVs) have found widespread use in applications such as surveillance, reconnaissance, and intelligent transport. Cooperation among such unmanned agents helps to accomplish a set of predefined mission goals [1], [2], which may require coordination among the agents so that they agree over the meeting point or front (consensus problem), satisfy some velocity constraints when they achieve consensus, or also maintain specific distances between the agents (formation control). This work addresses a problem where distributed control strategies need to be designed such that the states or outputs of a group of agents converge to a common value, called the consensus point, within a finite time. The consensus point or reference trajectory is not predefined, nor does a leader UAV possess information about it.

A game theoretic approach is considered in this work by assuming an egocentric model of the UAVs, in which each UAV minimizes a cost function associated with it, under certain behavioral assumptions about the other UAVs. In existing schemes in the literature, the UAVs implement noncooperative Nash equilibrium strategies (NES) to safeguard their own interests if communication and cooperation between UAVs are not allowed [2]. The states of a UAV may be coupled with those of the other UAVs in its cost functional. In such cases, the cost incurred by a UAV is affected by the strategy of another UAV if the UAVs minimize their cost functionals through noncooperative strategies.

On the other hand, when communication and agreement making are permitted, individual costs may be minimized if each UAV is informed about the decisions of the other UAVs and can lower the team cost through a suitable cooperative strategy. Given the team cost when the UAVs follow noncooperative strategies, a cooperative strategy can be found that results in a lower team cost [3], [4]. If the UAVs are considered as rational decision makers, all UAVs will prefer a cooperative strategy that results in a lower cost for all the members. However, such solutions may not be unique. Of these strategies, we look for the set of cooperative strategies such that all the UAVs cannot get cost benefits simultaneously by switching from one such strategy to another in this set. The strategies in this set are called Pareto-optimal strategies [5]. To highlight the subtleties involved in making a choice in such situations, we give below two examples of positional consensus of the UAVs in a group. In these examples, the choice of consensus state made by a UAV is its strategy.

Consider two UAVs, starting from points A and B, trying to meet at a common point (see Fig. 1). Each UAV takes the distance of the meeting point from its initial position as its cost function. Note that any point C′ on $\overline{AB}$ is a Pareto-optimal solution, since no other consensus point C′′ on $\overline{AB}$ exists such that $\overline{AC''} \leq \overline{AC'}$ and $\overline{C''B} \leq \overline{C'B}$, with strict inequality holding for at least one player. Now, consider the criterion that the sum of the squares of the distances to the consensus point be minimum. This criterion leads to the unique Pareto-optimal solution at point C, which is the midpoint of $\overline{AB}$.
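The midpoint claim can be checked with a short calculation. As an illustrative parametrization (the symbols $\ell$, $x$, and $J$ are introduced here and are not from the original text), let $\ell$ be the length of $\overline{AB}$ and $x$ the distance from A to a candidate meeting point on the segment:

```latex
\begin{align*}
J(x) &= x^{2} + (\ell - x)^{2}, \qquad 0 \le x \le \ell, \\
\frac{\mathrm{d}J}{\mathrm{d}x} &= 2x - 2(\ell - x) = 0
  \;\Longrightarrow\; x = \frac{\ell}{2}, \\
\frac{\mathrm{d}^{2}J}{\mathrm{d}x^{2}} &= 4 > 0 .
\end{align*}
```

Since $J$ is strictly convex, the stationary point $x = \ell/2$ is its unique minimizer on the segment, which is the midpoint C.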

Consider another positional consensus problem involving three UAVs, with their initial positions at E, F, and G (see Fig. 2). If the UAVs fail to reach a consensus over a meeting point, then they have to meet at the point O, which is equidistant from their initial positions. However, each player can gain by traveling a shorter distance if they meet at G′ instead, since $\overline{EG'} \leq \overline{EO}$, $\overline{FG'} \leq \overline{FO}$, and $\overline{GG'} \leq \overline{GO}$. G′ may not be the only point to satisfy this criterion. A nonempty set K ⊂ △EFG (shown in gray in Fig. 2) exists such that no two points K′, K′′ ∈ K can be found to satisfy either the set of conditions $\overline{EK'} \leq \overline{EK''}$, $\overline{FK'} \leq \overline{FK''}$, $\overline{GK'} \leq \overline{GK''}$, or the set of conditions $\overline{EK''} \leq \overline{EK'}$, $\overline{FK''} \leq \overline{FK'}$, $\overline{GK''} \leq \overline{GK'}$, with strict inequality holding for at least one UAV. All points in the set K are called Pareto-optimal solutions. Thus, as in the previous example, one may want a set of desired criteria that would result in a unique Pareto-optimal solution in K.
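The dominance relations in this example can be checked numerically. The sketch below uses hypothetical coordinates for E, F, and G (they are not taken from Fig. 2), chosen so that the triangle is obtuse and the equidistant point O falls outside it. The helper names `circumcenter` and `dominates` are likewise illustrative.

```python
import math

# Hypothetical initial positions (assumption: not the coordinates of Fig. 2).
E, F, G = (0.0, 0.0), (4.0, 0.0), (2.0, 0.5)


def circumcenter(p, q, r):
    """Return the point equidistant from p, q, r (assumed non-collinear)."""
    (ax, ay), (bx, by), (cx, cy) = p, q, r
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax**2 + ay**2, bx**2 + by**2, cx**2 + cy**2
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy)


def dominates(c1, c2, starts):
    """True if meeting at c1 is no farther than c2 for every UAV
    and strictly closer for at least one."""
    d1 = [math.dist(c1, s) for s in starts]
    d2 = [math.dist(c2, s) for s in starts]
    return all(a <= b for a, b in zip(d1, d2)) and any(a < b for a, b in zip(d1, d2))


O = circumcenter(E, F, G)   # fallback meeting point, equidistant from E, F, G
G1 = (2.0, 0.0)             # a candidate meeting point on segment EF
G2 = (2.0, 0.25)            # a second candidate inside the triangle

print("O =", O)
print("G1 dominates O:", dominates(G1, O, [E, F, G]))
print("G1 vs G2 comparable:",
      dominates(G1, G2, [E, F, G]) or dominates(G2, G1, [E, F, G]))
```

For these coordinates, the first candidate is better than O for every UAV, while the two candidates are not comparable with each other: one favors E and F, the other favors G. This is the situation in which an additional selection criterion is needed.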

The problems become more complicated with a large number of UAVs. The costs may be nonlinear, and consensus on velocity may also be a desired objective in addition to positional consensus. Thus, some additional criteria for a Pareto-optimal solution need to be identified in order to obtain a unique solution. Cooperative game theoretic solution techniques such as the Nash bargaining solution (NBS) and the egalitarian solution possess properties that help to yield a unique solution.

A variety of techniques, including game theoretic ones, have been used for consensus, rendezvous, and formation flying problems for multi-agent systems [6]. Of these, those related to achieving consensus, or rendezvous, or formation of unmanned vehicles and agents with linear or non-holonomic dynamics, are relevant to this work, and are reviewed briefly below.

Three-dimensional nonlinear guidance laws based on a back-stepping technique were proposed for a cooperative group of UAVs in [7]. A decentralized protocol requiring only the state information of the neighbour agents was proposed in [8] to achieve positional consensus. Decentralized sliding mode controllers were proposed in [9] to achieve a consensus in altitude and heading angle for a connected and leaderless swarm of UAVs. Approaches to rendezvous and the associated generalized problem of formation control include adaptive navigation gain based proportional navigation [10], deviated pursuit guidance law [11], vision-based nonlinear pursuit [12], bio-inspired method [13], optical-navigation [14], and special-point-based maneuver [15]. Formation control has also been addressed using Lyapunov technique [16], feedback linearization [17], decentralized control [18], leaderless framework [19], cyclic pursuit [20], line-of-sight guidance [21], and decentralized overlapping feedback laws [22]. Leader-follower strategies were adopted for formation control in [23], [24], [25], [26], [27], [28], [29], [30]. Although the techniques in these works [7]–[30] ensure convergence to consensus or rendezvous, they do not use any optimization criteria.

Some papers obtain consensus through optimization of a cost. For instance, [31] uses a single centralized cost function. Decentralized optimization schemes such as the dual decomposition method [32], decentralized receding horizon control [33], [34], [35], a model predictive control based hierarchical optimization framework [36], and a mixed integer linear programming algorithm [37] have also been proposed for formation control of decoupled dynamic agents. All these papers assume that the vehicles are cooperative and do not have any conflict of interest.

Another approach, which is the most relevant to our work, is to treat consensus as a decision making problem among multiple players. A negotiation scheme based distributed optimization technique was proposed in [38]. Decentralized and distributed task allocation schemes, based on team theory, game theory, and negotiation techniques, were used in [39]. Formation control by cooperative game theoretic methods was first suggested in [40]. In [41], an algorithm was presented for a multiple UAV system to reach Nash equilibrium; the UAVs’ decisions were based on local dynamics and global constraints. A stationary consensus protocol was designed in [42], where the agents were shown to reach a group decision value by optimizing individual objective functions. A penalty method based approach was proposed in [43], with decentralized optimization leading to the Nash bargaining solution. In [5], two algorithms were proposed to realize the NBS for nonzero-sum linear quadratic differential games (NZSLQDG) for a team of players, but without any formal proof of convergence. In [44], the criterion for individual controller structure was derived such that agents attain the NBS, assuming a local information structure and using an algorithm from [5]. Formation control of robots with double integrator dynamics was formulated as a finite time LQDG in [45]. In [46], a feedback information Nash equilibrium solution was proposed for a time-varying formation control of mobile robots. Formation of UAVs was formulated as a pursuer-evader game between a group of UAVs (pursuers) and a virtual target (evader) in [47]. In [48], a formation control problem for a multiple-UAV scenario was formulated as a differential game problem, and an open-loop Nash strategy design approach was proposed for distributed control.

In this paper, the rendezvous problem of UAVs is posed as an NZSLQDG. The cost functionals of the UAVs are assumed to be convex, so that a Pareto-optimal solution can be realized by minimizing a team cost obtained as a convex combination of individual UAV costs. However, the individual costs incurred cannot be easily related to the weights used to construct the team cost. Hence, selecting the bargaining weights to realize a particular cooperative game theoretic solution becomes difficult. The main contribution of this paper is a multi-player negotiation algorithm, together with the conditions on the cost functionals of the UAVs that make the proposed algorithm converge to the convex combination of weights leading to the Nash bargaining solution (NBS) in an NZSLQDG.
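As a minimal static illustration of the weighted-sum construction, consider a toy problem with one shared scalar decision and two quadratic costs. This is an assumption made for illustration only; it is not the differential game treated in the paper, and the names `team_minimizer`, `individual_costs`, and `pareto_front` are hypothetical.

```python
def team_minimizer(alpha, a, b):
    """Minimizer of alpha*(u - a)**2 + (1 - alpha)*(u - b)**2 over u.

    Setting the derivative to zero gives u = alpha*a + (1 - alpha)*b.
    """
    return alpha * a + (1.0 - alpha) * b


def individual_costs(u, a, b):
    """Costs incurred by player 1 (target a) and player 2 (target b)."""
    return (u - a) ** 2, (u - b) ** 2


def pareto_front(a, b, steps=11):
    """Sweep the weight alpha over [0, 1] and collect (alpha, J1, J2)."""
    front = []
    for k in range(steps):
        alpha = k / (steps - 1)
        u = team_minimizer(alpha, a, b)
        j1, j2 = individual_costs(u, a, b)
        front.append((alpha, j1, j2))
    return front


if __name__ == "__main__":
    for alpha, j1, j2 in pareto_front(a=0.0, b=1.0):
        print(f"alpha={alpha:.1f}  J1={j1:.3f}  J2={j2:.3f}")
```

Each weight yields one Pareto-optimal cost pair, and raising the weight on a player lowers that player's cost at the expense of the other. Even in this toy, the map from the weight to the individual costs is nonlinear, which hints at why choosing weights for a target outcome is not direct.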

Note that while [5] and [44] both dealt with the NBS of an NZSLQDG, they do not consider problems involving autonomous agents, which pose certain unique challenges. One such challenge is that, as the states evolve in time, an autonomous UAV may at some intermediate instant find it beneficial to switch to a non-cooperative mode, defeating the purpose of cooperation. In other words, a cooperative agreement made at the beginning of the game may fail to hold until the end, even though it fulfills the overall individual rationality criterion [49]. Thus, in addition to finding a Pareto-optimal solution like the NBS, it is pertinent to examine the time consistency [50] criteria of a solution and to formulate a mechanism that keeps the players in cooperation over the whole duration, since the players may start with a cooperative solution that does not remain time consistent. The time consistency issue has not so far been addressed in the literature on consensus/rendezvous problems involving autonomous UAVs. In this work, the conditions for time consistency of a cooperative solution in an NZSLQDG, in both the weak and the strong sense, are derived. We present examples where the NBS is not time consistent in the weak sense. In such cases, we propose that the UAVs initially negotiate on the NBS and employ the controls derived from it as long as the cost-to-go estimated by each UAV is lower for the NBS than for the Nash equilibrium based solution. The UAVs are made to renegotiate at the instant when one or more UAVs find that the cost-to-go along the previously derived NBS trajectory is greater than that of the Nash equilibrium based solution. After renegotiation at an intermediate point, the UAVs continue along the trajectories dictated by the NBS reached in the latest renegotiation. The proposed scheme prevents a UAV from switching from a cooperative to a noncooperative mode of play. In the simulation examples, we show how this renegotiation mechanism keeps the UAVs in cooperation for the whole game duration.
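The control flow of the renegotiation scheme described above can be sketched as follows. This is only a skeleton under stated assumptions: the callables `negotiate`, `coop_cost_to_go`, and `nash_cost_to_go` are hypothetical placeholders for the NBS computation and the two cost-to-go estimates, the game state is abstracted away, and the `toy_*` functions return made-up numbers purely to exercise the loop.

```python
from typing import Any, Callable, List, Sequence


def play_with_renegotiation(
    times: Sequence[float],
    n_uavs: int,
    negotiate: Callable[[float], Any],
    coop_cost_to_go: Callable[[int, float, Any], float],
    nash_cost_to_go: Callable[[int, float], float],
) -> List[float]:
    """Return the instants at which the UAVs negotiated or renegotiated."""
    agreement = negotiate(times[0])      # initial bargaining at the start
    negotiated_at = [times[0]]
    for t in times[1:]:
        # A UAV is tempted to defect if its cooperative cost-to-go under the
        # current agreement exceeds its Nash-equilibrium cost-to-go.
        tempted = any(
            coop_cost_to_go(i, t, agreement) > nash_cost_to_go(i, t)
            for i in range(n_uavs)
        )
        if tempted:
            agreement = negotiate(t)     # renegotiate from the current instant
            negotiated_at.append(t)
    return negotiated_at


# Toy stand-ins (assumptions): the "agreement" is just the time it was struck,
# and UAV 0's cooperative cost-to-go drifts upward as the agreement ages.
def toy_negotiate(t: float) -> float:
    return t


def toy_coop_cost_to_go(i: int, t: float, agreement: float) -> float:
    return 1.0 + 0.5 * (t - agreement) if i == 0 else 1.0


def toy_nash_cost_to_go(i: int, t: float) -> float:
    return 2.0


if __name__ == "__main__":
    print(play_with_renegotiation(
        list(range(10)), 3,
        toy_negotiate, toy_coop_cost_to_go, toy_nash_cost_to_go,
    ))
```

In a real implementation, the cost-to-go estimates would depend on the current state of the game and `negotiate` would recompute the bargaining weights from that state; the loop structure is the only part this sketch is meant to convey.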

The paper is organized as follows. In Section 2, the game theoretic formulation of the output consensus problem of UAVs is presented. An algorithm to find the NBS is proposed in Section 3. In the same section, we describe the time consistency issue of cooperative differential games and propose the renegotiation mechanism. Rendezvous problems for a multi-UAV team are simulated in Section 4, with a discussion of the simulation results. The paper concludes in Section 5 with a discussion of future directions of research.

Section snippets

Consensus over outputs

Consider UAVs indexed by $N = \{1, 2, \dots, n\}$. The state vector ξ_i of UAV i evolves as ${\dot{ξ}}_{i} = A_{i} ξ_{i} + b_{i} u_{i}$, where u_i is the control vector of UAV i. Let ξ denote the collection of state vectors of all the UAVs, $ξ = {[ξ_{1}^{T}, \dots, ξ_{n}^{T}]}^{T}$. When the consensus state is not predefined, consensus building at a time instant t_f, starting from a time instant t₀ (t_f > t₀), is the problem of driving the output vectors $h_{1}, \dots, h_{n}$ of the UAVs such that the condition $∥ h_{i} (t_{f}) - h_{j} (t_{f}) ∥ < ϵ$ is satisfied for a sufficiently small ϵ > 0, for

NBS algorithm

The solution to the minimization problem in Eq. (14) is a function of the parameter α, which provides a set of Pareto-optimal solutions (Fig. 3). The proposed algorithm should lead to α^NBS starting from any $α \in A$. This is difficult because the relation between an $α \in A$ and the corresponding individual costs $J_{i}^{α} (ξ_{0}, t_{0})$ is not explicitly known.
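To make the search for the bargaining weight concrete, the sketch below finds it by brute force in a static two-player toy with quadratic costs. Everything here is an assumption for illustration: the toy is not the LQDG of Eq. (14), the disagreement costs `d1` and `d2` stand in for the Nash equilibrium costs, and the grid search is a swapped-in technique, not the algorithm proposed in the paper or in [5]. The NBS is taken in its standard form as the individually rational point that maximizes the product of cost reductions relative to the disagreement point.

```python
def costs(alpha, a=0.0, b=1.0):
    """Individual costs at the minimizer of alpha*J1 + (1 - alpha)*J2,
    with J1 = (u - a)**2 and J2 = (u - b)**2 (toy assumption)."""
    u = alpha * a + (1.0 - alpha) * b
    return (u - a) ** 2, (u - b) ** 2


def nash_product(alpha, d1, d2):
    """Product of gains over the disagreement costs, or None if the
    outcome is not individually rational for both players."""
    j1, j2 = costs(alpha)
    g1, g2 = d1 - j1, d2 - j2
    if g1 <= 0.0 or g2 <= 0.0:
        return None
    return g1 * g2


def nbs_weight(d1, d2, grid=1001):
    """Grid search over alpha in [0, 1] for the largest Nash product."""
    best_alpha, best_value = None, float("-inf")
    for k in range(grid):
        alpha = k / (grid - 1)
        value = nash_product(alpha, d1, d2)
        if value is not None and value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha


if __name__ == "__main__":
    for d1, d2 in [(0.6, 0.6), (0.9, 0.5)]:
        alpha = nbs_weight(d1, d2)
        j1, j2 = costs(alpha)
        print(f"d=({d1}, {d2})  alpha={alpha:.3f}  "
              f"alpha*(d1-J1)={alpha * (d1 - j1):.4f}  "
              f"(1-alpha)*(d2-J2)={(1 - alpha) * (d2 - j2):.4f}")
```

In this toy, setting the derivative of the Nash product to zero gives α(d₁ − J₁) = (1 − α)(d₂ − J₂), so the two printed quantities agree up to the grid resolution. A brute-force sweep like this scales poorly with the number of players, which is one reason an iterative update on the weights is attractive.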

In [5], an algorithm has been proposed to find NBS that utilizes the property mentioned in Eq. (31). The proposed algorithm uses Eq. (31) for an update step,

Rendezvous problems and simulation results

In the following, two variants of rendezvous problems are considered among a group of UAVs. We assume that collision avoidance, though a desired feature in a guidance law designed for a multi-UAV scenario, is not a main issue, as the vehicles fly at different altitude levels.

Conclusions

In this work, we presented the problem of rendezvous among a group of unmanned aerial vehicles (UAVs), in the absence of a leader UAV and any reference trajectory, as a cost optimization problem for individual UAVs. In a cooperative game-theoretic treatment of the problem, an algorithm is proposed which converges to the Nash bargaining solution, among the set of Pareto-optimal solutions, where each UAV incurs a lower cost than what it obtains from Nash equilibrium. Due to its generalized nature,

References (53)

Y. Li et al., Nonlinear protocols for distributed consensus in directed networks of dynamic agents, J. Frankl. Inst. (2015)
X. Dong et al., Time-varying formation control for unmanned aerial vehicles with switching interaction topologies, Control Eng. Pract. (2016)
A. Mahmood et al., Decentralized formation flight control of quadcopters using robust feedback linearization, J. Frankl. Inst. (2017)
D.M. Stipanović et al., Decentralized overlapping control of a formation of unmanned aerial vehicles, Automatica (2004)
K. Chang et al., Coordinated formation control design with obstacle avoidance in three-dimensional space, J. Frankl. Inst. (2015)
Z. Li et al., Decentralized output-feedback formation control of multiple 3-dof laboratory helicopters, J. Frankl. Inst. (2015)
O. Saif et al., Flocking of multiple unmanned aerial vehicles by LQR control, Proceedings of the International Conference on Unmanned Aircraft Systems (ICUAS) (2014)
T. Keviczky et al., Decentralized receding horizon control for large scale dynamically decoupled systems, Automatica (2006)
L. Dai et al., Distributed MPC for formation of multi-agent systems with collision avoidance and obstacle avoidance, J. Frankl. Inst. (2017)
W. Zhao et al., Quadcopter formation flight control combining MPC and robust feedback linearization, J. Frankl. Inst. (2014)
M. Radmanesh et al., Flight formation of UAVs in presence of moving obstacles using fast-dynamic mixed integer linear programming, Aerosp. Sci. Technol. (2016)
J.P. Wangermann et al., Optimization and coordination of multiagent systems using principled negotiation, J. Guid. Control Dyn. (1999)
D. Gu, A differential game approach to formation control, IEEE Trans. Control Syst. Technol. (2008)
P.K. Dutta, A folk theorem for stochastic games, J. Econ. Theory (1995)
G. Zaccour, Time consistency in cooperative differential games: a tutorial, INFOR: Inf. Syst. Oper. Res. (2008)
B. Sinopoli et al., Distributed control applications within sensor networks, Proc. IEEE (2003)
T. Shima et al., UAV Cooperative Decision and Control (2009)
D. Bauso, Cooperative control and optimization: a neuro-dynamic programming approach (2003)
J.R. Marden et al., Game Theory and Distributed Control, Handbook of Game Theory (2012)
J.C. Engwerda, LQ Dynamic Optimization and Differential Games (2005)
Y. Cao et al., An overview of recent progress in the study of distributed multi-agent coordination, IEEE Trans. Ind. Inf. (2013)
M. Ahmed et al., Nonlinear guidance and consensus for unmanned vehicles with time varying connection topologies, Proceedings of the Forty-Ninth AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition (2011-76)
S. Rao et al., Sliding mode control-based autopilots for leaderless consensus of unmanned aerial vehicles, IEEE Trans. Control Syst. Technol. (2014)
A.L. Smith, Proportional navigation with adaptive terminal guidance for aircraft rendezvous, J. Guid. Control Dyn. (2008)
A. Ratnoo, Variable deviated pursuit for rendezvous guidance, J. Guid. Control Dyn. (2015)
J.W. Nichols et al., Aerial rendezvous of small unmanned aircraft using a passive towed cable system, J. Guid. Control Dyn. (2014)
