Robust Mean Field Games

Dynamic Games and Applications

Abstract

Recently there has been renewed interest in large-scale games in several research disciplines, with application domains as diverse as the smart grid, cloud computing, financial markets, biochemical reaction networks, transportation science, and molecular biology. Prior works have provided rich mathematical foundations and equilibrium concepts, but relatively little on robustness in the presence of uncertainties. In this paper, we study mean field games with uncertainty in both states and payoffs. We consider a population of players whose individual states are driven by a standard Brownian motion and a disturbance term. The contribution is threefold. First, we establish a mean field system for such robust games. Second, we apply the methodology to the production of an exhaustible resource. Third, we show that the dimension of the mean field system can be significantly reduced by considering a functional of the first moment of the mean field process.

Notes

  1. The distance and the norm referred to throughout the paper are the \(L^2\) distance and the \(L^2\) norm \(\Vert \cdot \Vert _2\). For instance, \(\hbox {dist}(x,A)=\min _{y \in A} \Vert x-y\Vert _2\).

Author information

Correspondence to Dario Bauso.

Additional information

The work of D. Bauso was supported by the 2012 “Research Fellow” Program of the Dipartimento di Matematica, Università di Trento and by PRIN 20103S5RN3 “Robust decision making in markets and organizations, 2013–2016.” The work of T. Başar was supported in part by the U.S. Air Force Office of Scientific Research (AFOSR) under MURI Grant FA9550-10-1-0573, and in part by NSA through the Information Trust Institute at the University of Illinois.

Appendix

In this appendix, we provide proofs for Propositions 2, 3, 4, 5, and 6, which constitute the main results of the paper.

1.1 Proof of Proposition 2

We utilize the strict convexity assumption for the function \(c(\cdot )\) with respect to the control. Then the Hamiltonian is well posed, and the derivative of the Hamiltonian with respect to p provides the drift term of the state from which we deduce the (individual state) feedback optimal control.

1.2 Proof of Proposition 3

The first equation is a Hamilton–Jacobi–Bellman–Fleming (HJBF) equation, solved backward from the terminal time \(T>0.\) The second equation describes the evolution of the state distribution and is the forward equation obtained from Definition 3. Combining the two equations yields the mean field system.

1.3 Proof of Proposition 4

The proof follows from Proposition 3 by letting \(\sigma _t=0\), which eliminates the disturbance term.

1.4 Proof of Proposition 5

We first prove condition (33). To do this, let us write the Hamiltonian as:

$$\begin{aligned} {H}_t(x_t,\partial _x v_t,\bar{m}_t)= {}&\inf _{u} \Big \{-h(\bar{m}_t,\zeta _t^*)u \\&\quad + \left[ \frac{a}{2} u^2 + b u\right] + \partial _x v_t (\alpha _t x_t+\beta _t u)\Big \}=0. \end{aligned} \tag{65}$$

Differentiating with respect to u, we obtain

$$\begin{aligned} a u - h(\bar{m}_t,\zeta _t^*) + b + \partial _x v_t \beta _t = 0 \end{aligned}$$

from which we can derive (33).
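As a sanity check on this step, the following minimal sketch solves the first-order condition for \(u\) and confirms numerically that the resulting point minimizes the expression inside the infimum in (65). It assumes \(a>0\), writes `p` for \(\partial _x v_t\) and `h` for \(h(\bar{m}_t,\zeta _t^*)\), and uses the closed form \(u=(h-b-\beta _t \partial _x v_t)/a\), which is what the first-order condition gives and which we take to be the content of (33); the function names and the random test values are illustrative only.

```python
import random

def hamiltonian_integrand(u, h, a, b, p, alpha, beta, x):
    """Expression inside the infimum of (65); p stands for the gradient of v."""
    return -h * u + (0.5 * a * u**2 + b * u) + p * (alpha * x + beta * u)

def u_star(h, a, b, p, beta):
    """Stationary point of the integrand: solves a*u - h + b + p*beta = 0."""
    return (h - b - beta * p) / a

random.seed(0)
max_violation = 0.0
for _ in range(1000):
    # Illustrative random coefficients (assumption, not taken from the paper).
    h, b, p, alpha, beta, x = (random.uniform(-2, 2) for _ in range(6))
    a = random.uniform(0.1, 3.0)  # a > 0 keeps the integrand strictly convex in u
    u0 = u_star(h, a, b, p, beta)
    f0 = hamiltonian_integrand(u0, h, a, b, p, alpha, beta, x)
    for du in (-1.0, -0.1, 0.1, 1.0):
        f1 = hamiltonian_integrand(u0 + du, h, a, b, p, alpha, beta, x)
        # At a stationary point of a quadratic, f(u0 + du) - f(u0) = (a/2) du^2.
        max_violation = max(max_violation, abs((f1 - f0) - 0.5 * a * du**2))

print(max_violation)
```

Because the increment \(\frac{a}{2}\,\mathrm{d}u^2\) is strictly positive for every nonzero perturbation, the stationary point is the unique minimizer whenever \(a>0\).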

We now prove (29)–(32). First notice that (30)–(32) are the boundary conditions, which follow directly from the HJBF equations and the evolution of the distribution of states.

To prove (29), let us replace u appearing in the Hamiltonian (65) by its expression (33):

$$\begin{aligned} {H}_t(x_t,\partial _x v_t,\bar{m}_t)\,&=\,u_t^* [-h(\bar{m}_t,\zeta _t^*)+b + \partial _x v_t \beta _t] \\&\quad +\, \frac{a}{2} (u_t^*)^2 + \partial _x v_t \alpha _t x_t \\&= \left( h(\bar{m}_t,\zeta _t^*) - b - \partial _x v_t \beta _t \right) ^2 \left( -\frac{1}{a} + \frac{1}{2a} \right) + \partial _x v_t \alpha _t x_t\\&= -\frac{1}{2a} \left( h(\bar{m}_t,\zeta _t^*) - b - \partial _x v_t \beta _t \right) ^2 + \partial _x v_t \alpha _t x_t\\&=-\frac{1}{2a} \Big (h(\bar{m}_t,\zeta _t^*)^2 + b^2 + ( \partial _x v_t \beta _t)^2 - 2 h(\bar{m}_t,\zeta _t^*) b\\&\quad -\, 2 h(\bar{m}_t,\zeta _t^*) \partial _x v_t \beta _t + 2 b \partial _x v_t \beta _t \Big ) + \partial _x v_t \alpha _t x_t \\&= -\,\frac{1}{2a} \beta _t^2 |\partial _x v_t|^2 + \Big [ -\frac{1}{2a} (- 2 h(\bar{m}_t,\zeta _t^*) \beta _t + 2 b \beta _t) \\&\quad +\,\alpha _t x_t\Big ] \partial _x v_t -\frac{1}{2a} \left( h(\bar{m}_t,\zeta _t^*)^2 + b^2 - 2 h(\bar{m}_t,\zeta ^*_t)b \right) . \end{aligned}$$

Using the above expression of the Hamiltonian in the HJBF equation (21), we arrive at (29).
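The algebra above can be checked numerically. The sketch below compares the Hamiltonian obtained by substituting \(u_t^*=(h-b-\beta _t\partial _x v_t)/a\) into the expression inside the infimum of (65) with the final expanded form, a quadratic polynomial in \(\partial _x v_t\). As before, `p` stands for \(\partial _x v_t\), `h` for \(h(\bar{m}_t,\zeta _t^*)\), and the coefficient values are random placeholders rather than values from the paper.

```python
import random

def substituted_hamiltonian(h, a, b, p, alpha, beta, x):
    """Integrand of (65) evaluated at u = (h - b - beta*p) / a."""
    u = (h - b - beta * p) / a
    return -h * u + 0.5 * a * u**2 + b * u + p * (alpha * x + beta * u)

def expanded_hamiltonian(h, a, b, p, alpha, beta, x):
    """Last line of the derivation: quadratic, linear and constant terms in p."""
    quad = -beta**2 / (2 * a)
    lin = -(1 / (2 * a)) * (-2 * h * beta + 2 * b * beta) + alpha * x
    const = -(1 / (2 * a)) * (h**2 + b**2 - 2 * h * b)
    return quad * p**2 + lin * p + const

random.seed(1)
worst_gap = 0.0
for _ in range(1000):
    # Illustrative random coefficients (assumption, not taken from the paper).
    h, b, p, alpha, beta, x = (random.uniform(-2, 2) for _ in range(6))
    a = random.uniform(0.1, 3.0)  # a > 0, as required for the infimum to exist
    lhs = substituted_hamiltonian(h, a, b, p, alpha, beta, x)
    rhs = expanded_hamiltonian(h, a, b, p, alpha, beta, x)
    worst_gap = max(worst_gap, abs(lhs - rhs))

print(worst_gap)
```

The two forms agree up to floating-point rounding, which supports the coefficients of \(|\partial _x v_t|^2\), \(\partial _x v_t\), and the constant term that enter (29).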

To prove (31), we simply plug (13) and (33) into (24), and this concludes the proof.

1.5 Proof of Proposition 6

To prove that the feedback policy \(u^*\) as in (19) solves the disturbance attenuation problem (item 1), observe that

$$\begin{aligned} J^{\infty }(x,u^*,m^*,\zeta )&:= \mathbb {E} \Big (g(x_T) + \int _{0}^T c_t(x_t,u^*_t,m^*_t)\, \mathrm{d}t\Big )\\&= \mathbb {E} \Big (v_T(x_T) +\int _{0}^T c_t(x_t,u^*_t,m^*_t)\, \mathrm{d}t\Big )\\&= \mathbb {E} \Big ( v_0(x) + \int _{0}^T ( \partial _t v_t(x) + \partial _x v_t(x) \frac{\mathrm{d}x_t}{\mathrm{d}t} + c_t(x_t,u^*_t,m^*_t) )\,\mathrm{d}t\Big )\\&= \mathbb {E} \Big ( v_0(x) + \int _{0}^T ( - \tilde{H}_t(x,\partial _xv_t,m^*_t)-\frac{\sigma _t^2 x^2}{2}\partial ^2_{xx}v_t(x) \\&\qquad +\,\partial _x v_t(x) \frac{\mathrm{d}x_t}{\mathrm{d}t} + c_t(x_t,u^*_t,m^*_t) )\,\mathrm{d}t\Big )\\&\le \mathbb {E} \Big ( v_0(x) + \int _{0}^T ( - c(x,u^*_t,m^*_t) +\gamma ^2(\zeta ^*)^2 - \partial _x v_t(x) \frac{\mathrm{d}x_t^*}{\mathrm{d}t}\\&\qquad -\,\frac{\sigma _t^2 x^2}{2}\partial ^2_{xx}v_t(x) + \partial _x v_t(x) \frac{\mathrm{d}x_t^*}{\mathrm{d}t} + c_t(x_t,u^*_t,m^*_t) )\,\mathrm{d}t\Big ) \\&\qquad +\,\gamma ^2 q_0(x) - \gamma ^2 q_0(x) \\&\le \mathbb {E} \Big ( \int _{0}^T ( \gamma ^2(\zeta ^*)^2 - \frac{\sigma _t^2 x^2}{2}\partial ^2_{xx}v_t(x) )\,\mathrm{d}t\Big ) + \gamma ^2 q_0(x)\\&\le \gamma ^2 \left( \Vert \zeta \Vert ^2 + q_0(x)\right) . \end{aligned}$$

The first equality holds by the definition of the cost \(J^{\infty }(x,u^*,m^*,\zeta )\). In the second equality, we use the boundary condition (22). In the third equality, we replace \(v_T(x_T)\) by \(v_0(x) + \int _{0}^T \left( \partial _t v_t(x)\, \mathrm{d}t + \partial _x v_t(x)\, \mathrm{d}x_t \right) \). In the fourth equality, we use the HJBF equation (15). In the fifth step, which is the first inequality, we use the expression of the robust Hamiltonian (17). Here \(\mathrm{d}x_t^*\) denotes the infinitesimal state under \(u^*\) and \(\zeta ^*\).

To prove that \(\mathbb R^{|\mathcal {X}|}_-\) is attainable by \(\mathcal {V}_0(x)\) (item 2), we need to show that \(\lim _{t \rightarrow 0} \Vert [ \mathcal {V}_t ]_+ \Vert = 0 \). To see that this holds, note that for every \(x \in \mathcal {X},\,v_0(x) \le \gamma ^2 q_0(x)\) because \(\gamma \ge \hat{\gamma }^{\mathrm{NCL}}\).
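The role of the positive part \([\cdot ]_+\) can be illustrated with a small sketch. The Euclidean distance from a vector to the nonpositive orthant equals the \(L^2\) norm of its positive part, so this distance vanishes exactly when every component is nonpositive. The definition of \(\mathcal {V}_t\) is not reproduced in this appendix; purely for illustration, the code assumes that its components are the gaps \(v_0(x)-\gamma ^2 q_0(x)\), and all numerical values and function names are hypothetical.

```python
import math

def positive_part(v):
    """Componentwise positive part [v]_+."""
    return [max(c, 0.0) for c in v]

def l2_norm(v):
    """Euclidean norm of a finite vector."""
    return math.sqrt(sum(c * c for c in v))

def dist_to_nonpositive_orthant(v):
    """Distance from v to the nonpositive orthant, via the Euclidean projection."""
    projection = [min(c, 0.0) for c in v]  # projection clips each component at zero
    return l2_norm([c - y for c, y in zip(v, projection)])

# Hypothetical data on a three-point state space (assumption for illustration).
gamma = 2.0
v0 = [1.0, 3.5, 1.0]
q0 = [0.5, 1.0, 0.25]
gap = [v - gamma**2 * q for v, q in zip(v0, q0)]  # assumed components of V_0

print(gap, dist_to_nonpositive_orthant(gap))

# A vector with positive components stays at a positive distance.
violating = [0.3, -1.0, 0.4]
print(dist_to_nonpositive_orthant(violating), l2_norm(positive_part(violating)))
```

In the first case every gap is nonpositive, mirroring the condition \(v_0(x) \le \gamma ^2 q_0(x)\), and the distance is zero; in the second case the distance coincides with the norm of the positive part.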

Cite this article

Bauso, D., Tembine, H. & Başar, T. Robust Mean Field Games. Dyn Games Appl 6, 277–303 (2016). https://doi.org/10.1007/s13235-015-0160-4
