Abstract
In this paper, we propose non-model-based strategies for locally stable convergence to Nash equilibrium in quadratic noncooperative games in which the acquisition of information (of two different types) incurs delays. Two sets of results are introduced: (a) one, which we call the cooperative scenario, where each player employs knowledge of the functional form of his payoff and of the other players’ actions, but with delays; and (b) a second one, which we term the noncooperative scenario, where the players have access only to their own payoff values, again with delay. Both approaches are based on the extremum seeking perspective, which has previously been reported for real-time optimization problems and exploits sinusoidal excitation signals to estimate the gradient (first derivative) and Hessian (second derivative) of unknown quadratic functions. To compensate for distinct delays in the inputs of the players, we employ predictor feedback. We apply a small-gain analysis as well as averaging theory in infinite dimensions, due to the infinite-dimensional state of the time delays, in order to obtain local convergence results for the unknown quadratic payoffs to a small neighborhood of the Nash equilibrium. We quantify the size of these residual sets and corroborate the theoretical results numerically on an example of a two-player game with delays.
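The demodulation idea behind extremum seeking can be illustrated on a scalar quadratic map. The following is a minimal sketch and not the algorithm of the paper: it omits the delays, the predictor feedback and the coupling between players, and the function name `demodulated_estimates`, the dither amplitude `a`, the frequency `omega` and the payoff parameters are all illustrative assumptions. The input is perturbed by \(a\sin (\omega t)\), the measured payoff is multiplied by \((2/a)\sin (\omega t)\) and by \(-(8/a^2)\cos (2\omega t)\), and averaging over one dither period recovers the gradient and the Hessian of the quadratic.

```python
import numpy as np

def demodulated_estimates(theta_hat, payoff, a=0.1, omega=10.0, n=4096):
    """Average the demodulated payoff over one dither period (hypothetical helper)."""
    # n uniform samples of one period, endpoint excluded
    t = np.arange(n) * (2.0 * np.pi / omega) / n
    # payoff measured at the perturbed input theta_hat + a*sin(omega*t)
    y = payoff(theta_hat + a * np.sin(omega * t))
    # gradient estimate: average of (2/a)*sin(omega*t)*y
    grad = np.mean((2.0 / a) * np.sin(omega * t) * y)
    # Hessian estimate: average of -(8/a^2)*cos(2*omega*t)*y
    hess = np.mean((-8.0 / a**2) * np.cos(2.0 * omega * t) * y)
    return grad, hess

# Assumed quadratic payoff J(theta) = J_star + 0.5*H*(theta - theta_star)^2
H, theta_star, J_star = -2.0, 1.5, 3.0
payoff = lambda th: J_star + 0.5 * H * (th - theta_star) ** 2

theta_hat = 0.5
grad, hess = demodulated_estimates(theta_hat, payoff)
print(grad, H * (theta_hat - theta_star))  # estimated vs. exact gradient
print(hess, H)                             # estimated vs. exact Hessian
```

For a quadratic payoff the averaged signals match the exact gradient \(H(\hat{\theta }-\theta ^*)\) and Hessian \(H\) up to rounding, because the higher harmonics produced by the dither average to zero over a full period.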






Notes
By strict concavity, we mean \(J_i(\theta )\) is strictly concave in \(\theta _i\) for all \(\theta _{-i}\), this being so for each \(i=1,\ldots , N\).
If the scalar \(c<0\) is considered, the direction of convection must be reversed such that the boundary \(u(0,t)\) is replaced by \(u(1,t)\) and vice versa.
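As an illustration of the strict-concavity requirement in the first note, the sketch below computes the Nash equilibrium of a two-player quadratic game from the first-order conditions. The payoff form, the matrix `H`, the vector `h` and the helper names are hypothetical choices made for this example only; they are not the parameters used in the paper.

```python
import numpy as np

# Assumed two-player game with scalar actions theta = (theta_1, theta_2):
#   J_i(theta) = 0.5*H[i,i]*theta_i^2 + H[i,j]*theta_i*theta_j + h[i]*theta_i
H = np.array([[-2.0, 1.0],
              [1.0, -3.0]])
h = np.array([1.0, 2.0])

def payoff(i, theta):
    """Payoff of player i (0-based index) for the action profile theta."""
    j = 1 - i
    return 0.5 * H[i, i] * theta[i] ** 2 + H[i, j] * theta[i] * theta[j] + h[i] * theta[i]

# Strict concavity of J_i in theta_i holds when H[i,i] < 0 for each player.
strictly_concave = bool(np.all(np.diag(H) < 0))

# First-order conditions dJ_i/dtheta_i = 0 for both players: H @ theta + h = 0.
theta_star = np.linalg.solve(H, -h)

def unilateral_gain(i, delta):
    """Payoff change for player i when only player i deviates by delta."""
    deviated = theta_star.copy()
    deviated[i] += delta
    return payoff(i, deviated) - payoff(i, theta_star)

print(strictly_concave, theta_star)
print(max(unilateral_gain(i, d) for i in (0, 1) for d in np.linspace(-1.0, 1.0, 21)))
```

At the computed point no player can improve his own payoff by deviating unilaterally, which is the defining property of a Nash equilibrium; strict concavity makes each player's best response unique.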
Abbreviations
- ES: Extremum seeking
- ODE: Ordinary differential equation
- PDE: Partial differential equation
- FDE: Functional differential equation
- ISS: Input-to-state stability
References
Fudenberg, D., Tirole, J.: Game Theory. The MIT Press, Cambridge (1991)
Başar, T., Zaccour, G. (eds.): Handbook of Dynamic Game Theory, vol. I. Springer International Publishing, Berlin (2018)
Han, Z., Niyato, D., Saad, W., Başar, T.: Game Theory for Next Generation Wireless and Communication Networks: Modeling, Analysis, and Design. Cambridge University Press, Cambridge (2019)
Amin, S., Schwartz, G.A., Sastry, S.S.: Security of interdependent and identical networked control systems. Automatica 49, 186–192 (2013)
Başar, T., Zaccour, G. (eds.): Handbook of Dynamic Game Theory, (Applications of Dynamic Games), vol. II. Springer International Publishing, Berlin (2018)
Starr, A.W., Ho, Y.C.: Nonzero-sum differential games. J. Optim. Theory Appl. 3, 184–206 (1969)
Petrovic, B., Gajic, Z.: Recursive solution of linear-quadratic Nash games for weakly interconnected systems. J. Optim. Theory Appl. 56, 463–477 (1988)
Srikant, R., Başar, T.: Iterative computation of noncooperative equilibria in nonzero-sum differential games with weakly coupled players. J. Optim. Theory Appl. 71, 137–168 (1991)
Wang, W., Sun, H., Van den Brink, R., Xu, G.: The family of ideal values for cooperative games. J. Optim. Theory Appl. 180, 1065–1086 (2018)
Cotrina, J., Zúñiga, J.: Time-dependent generalized Nash equilibrium problem. J. Optim. Theory Appl. 179, 1054–1064 (2018)
Aussel, D., Svensson, A.: Towards tractable constraint qualifications for parametric optimisation problems and applications to generalised Nash games. J. Optim. Theory Appl. 182, 404–416 (2019)
Alasseur, C., Taher, I.B., Matoussi, A.: An extended mean field game for storage in smart grids. J. Optim. Theory Appl. 184, 644–670 (2020)
Başar, T., Olsder, G.J.: Dynamic Noncooperative Game Theory. SIAM Series in Classics in Applied Mathematics. SIAM, Philadelphia (1999)
Nash, J.F.: Noncooperative games. Ann. Math. 54, 286–295 (1951)
Li, S., Başar, T.: Distributed learning algorithms for the computation of noncooperative equilibria. Automatica 23, 523–533 (1987)
Başar, T.: Relaxation techniques and the on-line asynchronous algorithms for computation of noncooperative equilibria. J. Econ. Dyn. Control 11, 531–549 (1987)
Zhu, Q., Tembine, H., Başar, T.: Hybrid learning in stochastic games and its applications in network security (chapter 14). In: Lewis, F.L., Liu, D. (eds.) Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. Series on Computational Intelligence, pp. 305–329. IEEE Press/Wiley, New York (2013)
Frihauf, P., Krstic, M., Başar, T.: Nash equilibrium seeking in noncooperative games. IEEE Trans. Autom. Control 57, 1192–1207 (2012)
Krstic, M., Wang, H.H.: Stability of extremum seeking feedback for general dynamic systems. Automatica 36, 595–601 (2000)
Alpcan, T., Başar, T.: Network Security: A Decision and Game Theoretic Approach. Cambridge University Press, Cambridge (2011)
Ciletti, M.D.: Differential games with information time lag: norm-invariant systems. J. Optim. Theory Appl. 9, 293–301 (1972)
Mori, K., Shimemura, E.: Linear differential games with delayed and noisy information. J. Optim. Theory Appl. 13, 275–289 (1974)
Kaskosz, B., Tadumadze, T.: A differential game of evasion with delays. J. Optim. Theory Appl. 44, 231–268 (1984)
Ehtamo, H., Hämäläinen, R.P.: Incentive strategies and equilibria for dynamic games with delayed information. J. Optim. Theory Appl. 63, 355–369 (1989)
Glizer, V.Y., Shinar, J.: Optimal evasion from a pursuer with delayed information. J. Optim. Theory Appl. 111, 7–38 (2001)
Pamen, O.M.: Optimal control for stochastic delay systems under model uncertainty: a stochastic differential game approach. J. Optim. Theory Appl. 167, 998–1031 (2015)
Carmona, R., Fouque, J.-P., Mousavi, S.M., Sun, L.-H.: Systemic risk and stochastic games with delay. J. Optim. Theory Appl. 179, 366–399 (2018)
Krstic, M.: Delay Compensation for Nonlinear, Adaptive, and PDE Systems. Birkhäuser, Boston (2009)
Oliveira, T.R., Krstic, M., Tsubakino, D.: Extremum seeking for static maps with delays. IEEE Trans. Autom. Control 62, 1911–1926 (2017)
Oliveira, T.R., Tsubakino, D., Krstic, M.: A simplified multivariable gradient extremum seeking for distinct input delays with delay-independent convergence rates. In: American Control Conference (ACC), Denver, CO, USA, pp. 608–613 (2020)
Karafyllis, I., Krstic, M.: Input-to-State Stability for PDEs. Springer, Cham (2018)
Hale, J.K., Lunel, S.M.V.: Averaging in infinite dimensions. J. Integral Equ. Appl. 2, 463–494 (1990)
Khalil, H.K.: Nonlinear Systems. Prentice Hall, Upper Saddle River (2002)
Hale, J.K., Lunel, S.M.V.: Introduction to Functional Differential Equations. Springer, Berlin (1993)
Fridman, E.: Introduction to Time-Delay Systems: Analysis and Control. Birkhäuser, Basel (2014)
Ghaffari, A., Krstic, M., Nesic, D.: Multivariable Newton-based extremum seeking. Automatica 48, 1759–1767 (2012)
Artstein, Z.: Linear systems with delayed controls: a reduction. IEEE Trans. Autom. Control 27, 869–879 (1982)
Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1985)
Oliveira, T.R., Hsu, L., Peixoto, A.J.: Output-feedback global tracking for unknown control direction plants with application to extremum-seeking control. Automatica 47, 2029–2038 (2011)
Feiling, J., Koga, S., Krstic, M., Oliveira, T.R.: Gradient extremum seeking for static maps with actuation dynamics governed by diffusion PDEs. Automatica 95, 197–206 (2018)
Oliveira, T.R., Feiling, J., Koga, S., Krstic, M.: Multivariable extremum seeking for PDE dynamic systems. IEEE Trans. Autom. Control (Early Access). 10.1109/TAC.2020.3005177 (2020)
Acknowledgements
The first and second authors thank the Brazilian funding agencies CAPES, CNPq and FAPERJ for the financial support.
Appendix: Averaging and Small-Gain Theorems
Theorem A.1
(Averaging Theorem for FDEs [32]) Consider the delay system
$$\begin{aligned} {\dot{x}}(t) = f(t/\epsilon , x_t), \quad \forall t \ge 0, \end{aligned}$$(134)
where \(\epsilon \) is a real parameter, \(x_t(\varTheta ) = x (t+\varTheta )\) for \(-r\le \varTheta \le 0\), and \(f : {\mathbb {R}}_{+} \times \varOmega \rightarrow {\mathbb {R}}^n\) is a continuous functional from a neighborhood \(\varOmega \) of 0 of the supremum-normed Banach space \(X = C([-r, 0]; {\mathbb {R}}^n)\) of continuous functions from \([-r, 0]\) to \({\mathbb {R}}^n\). Assume that \(f(t,\varphi )\) is periodic in t uniformly with respect to \(\varphi \) in compact subsets of \(\varOmega \) and that f has a continuous Fréchet derivative \(\partial f (t,\varphi )/\partial \varphi \) in \(\varphi \) on \({\mathbb {R}}_{+} \times \varOmega \). If \(y = y_0\in \varOmega \) is an exponentially stable equilibrium for the average system
$$\begin{aligned} {\dot{y}}(t) = f_0(y_t), \quad \forall t \ge 0, \end{aligned}$$(135)
where \(f_0(\varphi )=\lim _{T\rightarrow \infty }\frac{1}{T} \int _{0}^{T} f(s,\varphi ) \hbox {d}s\), then, for some \(\epsilon _0 > 0\) and \(0 <\epsilon \le \epsilon _0\), there is a unique periodic solution \(t \mapsto x^*(t,\epsilon )\) of (134) with the properties of being continuous in t and \(\epsilon \), satisfying \(|x^*(t, \epsilon ) - y_0| \le {\mathcal {O}}(\epsilon )\), for \(t \in {\mathbb {R}}_{+}\), and such that there is \(\rho >0\) so that if \(x(\cdot ;\varphi )\) is a solution of (134) with \(x(s) = \varphi \) and \(|\varphi - y_0| < \rho \), then \(|x(t)-x^*(t,\epsilon )| \le C e^{-\gamma (t-s)}\), for \(C>0\) and \(\gamma >0\).
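The statement of the averaging theorem can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: it takes the scalar delay system \({\dot{x}}(t) = -2\sin ^2(t/\epsilon )\,x(t-r)\), whose average system is \({\dot{y}}(t) = -y(t-r)\), and integrates both with a forward Euler scheme. The system, the function name `simulate`, the delay `r`, the step `dt` and the value of `eps` are hypothetical choices, not taken from the paper.

```python
import numpy as np

def simulate(eps, r=0.5, dt=1e-3, t_end=10.0, averaged=False):
    """Forward Euler for x'(t) = -gain(t/eps) * x(t - r) with history x = 1 on [-r, 0]."""
    n_delay = int(round(r / dt))      # number of steps spanned by the delay
    n_steps = int(round(t_end / dt))
    # entries 0..n_delay store the constant initial history on [-r, 0]
    x = np.ones(n_delay + n_steps + 1)
    for k in range(n_steps):
        t = k * dt
        # fast periodic gain, or its time average (equal to 1) for the average system
        gain = 1.0 if averaged else 2.0 * np.sin(t / eps) ** 2
        i = n_delay + k               # array index of x(t)
        x[i + 1] = x[i] - dt * gain * x[i - n_delay]
    return x[n_delay:]                # samples of x on [0, t_end]

eps = 0.01
x = simulate(eps)                     # original system with fast oscillation
y = simulate(eps, averaged=True)      # average system
gap = np.max(np.abs(x - y))
print(gap, abs(x[-1]), abs(y[-1]))
```

Here the origin is the exponentially stable equilibrium of the average system (the delay satisfies \(r < \pi /2\)), and the trajectory of the original system stays within a distance of order \(\epsilon \) of the averaged one while both decay, in line with the theorem.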
Theorem A.2
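(Small-Gain Theorem for ODE and Hyperbolic PDE Loops [31]) Consider generalized solutions of the following initial-boundary value problem
$$\begin{aligned} {\dot{x}}(t) = F(x(t),u(z,t),v(t)), \quad \forall t \ge 0, \end{aligned}$$(136)
$$\begin{aligned} \frac{\partial u}{\partial t}(z,t)+c\,\frac{\partial u}{\partial z}(z,t) = a(z)u(z,t)+g(z,x(t),u(z,t))+f(z,t), \quad \forall (z,t) \in [0,1]\times {\mathbb {R}}_{+}, \end{aligned}$$(137)
$$\begin{aligned} u(0,t) = \varphi (d(t),u(z,t),x(t)), \quad \forall t \ge 0. \end{aligned}$$(138)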
The state of the system (136)–(138) is \((u(z,t),x(t))\in C^{0}([0,1]\times {\mathbb {R}}_{+}) \times {\mathbb {R}}^n\), while the other variables \(d\in C^{0}({\mathbb {R}}_+;{\mathbb {R}}^q)\), \(f\in C^{0}([0,1] \times {\mathbb {R}}_+)\) and \(v\in C^{0}({\mathbb {R}}_+\,;{\mathbb {R}}^m)\) are external inputs. We assume that \((0,0) \in C^{0}([0,1])\times {\mathbb {R}}^n\) is an equilibrium point for the input-free system, i.e., \(F(0,0,0)=0\), \(g(z,0,0)=0\), and \(\varphi (0,0,0)=0\). Now, we assume that the ODE subsystem satisfies the ISS property:
- (H1) There exist constants \(M, \sigma >0\), \(b_3, \gamma _3\ge 0\), such that for every \(x_0\in {\mathbb {R}}^n\), \(u\in C^{0}([0,1] \times {\mathbb {R}}_{+})\) and \(v\in C^0({\mathbb {R}}_{+}\,;{\mathbb {R}}^{m})\) the unique solution \(x \in C^{1}({\mathbb {R}}_{+}\,;{\mathbb {R}}^{n})\) of (136) with \(x(0)=x_0\) satisfies the following estimate
$$\begin{aligned} |x(t)| \le M |x_0| \exp (-\sigma t) + \max _{0\le s \le t}(\gamma _3 \Vert u(s)\Vert _{\infty }+b_3|v(s)|), \quad \forall t \ge 0. \end{aligned}$$(139)
We next need to estimate the static gain of the interconnections. To this purpose, we employ the following further assumption.
- (H2) There exist constants \(b_2,\gamma _1,\gamma _2,A,B \ge 0\) such that the following growth conditions hold for every \(x\in C^{1}({\mathbb {R}}_{+};{\mathbb {R}}^{n})\), \(u \in C^{0}([0,1]\times {\mathbb {R}}_{+})\) and \(d \in C^{0}({\mathbb {R}}_{+};{\mathbb {R}}^{q})\):
$$\begin{aligned} |g(z,x,u)|\le & {} A \Vert u\Vert _{\infty } + \gamma _1 |x|, \quad \forall z \in [0,1], \end{aligned}$$(140)
$$\begin{aligned} |\varphi (d,u,x)|\le & {} B \Vert u\Vert _{\infty } + \gamma _2 |x| + b_2 |d|. \end{aligned}$$(141)
Let \(c>0\) (see Note 2) be a given constant and \(a \in C^{0}([0,1])\) be a given function. Consider continuous mappings \(F:{\mathbb {R}}^n \times C^{0}([0,1]) \times {\mathbb {R}}^m \rightarrow {\mathbb {R}}^n\), \(g:[0,1] \times {\mathbb {R}}^n \times C^{0}([0,1]) \rightarrow {\mathbb {R}}\), \(\varphi : {\mathbb {R}}^q \times C^{0}([0,1]) \times {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) with \(F(0,0,0)=0\), for which there exist constants \(L>0\), \({\bar{N}}\in [0,1)\) such that the inequalities \(\max _{0\le z \le 1}(|g(z,x,u)-g(z,y,w)|)+|F(x,u,v)-F(y,w,v)|\le L|x-y|+L\Vert u-w\Vert _{\infty }\), \(|\varphi (d,u,x)-\varphi (d,w,y)|\le \bar{N}|x-y|+\bar{N}\Vert u-w\Vert _{\infty }\) hold for all \(u, w \in C^{0}([0,1])\), \(x,y \in {\mathbb {R}}^n\), \(v\in {\mathbb {R}}^{m}\), \(d\in {\mathbb {R}}^{q}\). Suppose that Assumptions (H1) and (H2) hold and that the following small-gain condition is satisfied:
with \(p(z) {:=} \exp \left( c^{-1}\int _{0}^{z}a(w)dw\right) \) for \(z \in [0,1]\) [recall (8.2.11) and (8.2.14) in [31, Section 8.2]]. Then, there exist constants \(\delta , \varTheta , \gamma > 0\) such that for every \(u_0 \in C^{0}([0,1])\), \(x_0 \in {\mathbb {R}}^{n}\), \(d \in C^{0}({\mathbb {R}}_{+}\,;{\mathbb {R}}^{q})\) with \(u_0(0) = \varphi (d(0),u_0,x_0)\), \(f \in C^{0}([0,1] \times {\mathbb {R}}_{+})\), and \(v \in C^{0}({\mathbb {R}}_{+}\,;{\mathbb {R}}^{m})\) the unique generalized solution of the initial-boundary value problem (136), (137), (138) satisfies the following estimate:
Cite this article
Oliveira, T.R., Rodrigues, V.H.P., Krstić, M. et al. Nash Equilibrium Seeking in Quadratic Noncooperative Games Under Two Delayed Information-Sharing Schemes. J Optim Theory Appl 191, 700–735 (2021). https://doi.org/10.1007/s10957-020-01757-z
DOI: https://doi.org/10.1007/s10957-020-01757-z
Keywords
- Extremum seeking
- Nash equilibrium
- (Non)cooperative games
- Time delays
- Predictor feedback
- Averaging in infinite dimensions