Risk-Sensitive Mean Field Games via the Stochastic Maximum Principle

Moon, Jun; Başar, Tamer

doi:10.1007/s13235-018-00290-z

Risk-Sensitive Mean Field Games via the Stochastic Maximum Principle

Published: 13 December 2018

Volume 9, pages 1100–1125, (2019)
Cite this article

Dynamic Games and Applications Aims and scope Submit manuscript

825 Accesses
11 Citations
Explore all metrics

Abstract

In this paper, we consider risk-sensitive mean field games via the risk-sensitive maximum principle. The problem is analyzed through two sequential steps: (i) risk-sensitive optimal control for a fixed probability measure, and (ii) the associated fixed-point problem. For step (i), we use the risk-sensitive maximum principle to obtain the optimal solution, which is characterized in terms of the associated forward–backward stochastic differential equation (FBSDE). In step (ii), we solve for the probability law induced by the state process with the optimal control in step (i). In particular, we show the existence of the fixed point of the probability law of the state process determined by step (i) via Schauder’s fixed-point theorem. After analyzing steps (i) and (ii), we prove that the set of N optimal distributed controls obtained from steps (i) and (ii) constitutes an approximate Nash equilibrium or $\epsilon $-Nash equilibrium for the N player risk-sensitive game, where $\epsilon \rightarrow 0$ as $N \rightarrow \infty $ at the rate of $O(\frac{1}{N^{1/(n+4)}})$. Finally, we discuss extensions to heterogeneous (non-symmetric) risk-sensitive mean field games.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Partially Observed Discrete-Time Risk-Sensitive Mean Field Games

Article 07 June 2022

Naci Saldi, Tamer Başar & Maxim Raginsky

Risk, Optimization and Meanfield Type Control

Mean-Field-Type Games with Jump and Regime Switching

Article 28 February 2019

Alain Bensoussan, Boualem Djehiche, … Sheung Chi Phillip Yam

Notes

In fact, for step (i), we provide the risk-sensitive maximum principle for the general r-dimensional Brownian motion in “Appendix A” section, which generalizes the result of the one-dimensional Brownian motion in [42] when the diffusion coefficient is independent of control.
This assumption can be relaxed to a complete separable metric space [65].
A related discussion on this issue is provided in [6, Chapter 6].
This existence is dependent on the value of $\gamma $, and when $\gamma $ is large, the corresponding Riccati equation always admits a unique solution [7, 48].
A discussion on Lipschitz continuity of the optimal control in stochastic optimal control theory can be found in [28].
In this Appendix, we do not state specific regularity conditions for the corresponding stochastic optimal control problem, since they are quite similar to the assumptions already made in the paper. See [29, 42, 60, 65] for the regularity conditions for the stochastic optimal control problem.

References

Achdou Y, Capuzzo-Dolcetta I (2010) Mean field games: numerical methods. SIAM J Numer Anal 48(3):1136–1162
MathSciNet MATH Google Scholar
Achdou Y, Camilli F, Capuzzo-Dolcetta I (2012) Mean field games: numerical methods for the planning problem. SIAM J Control Opt 50(1):77–109
MathSciNet MATH Google Scholar
Ahuja S (2016) Wellposedness of mean field games with common noise under a weak monotonicity condition. SIAM J Control Opt 54(1):30–48
MathSciNet MATH Google Scholar
Andersson D, Djehiche B (2010) A maximum principle for SDEs of mean-field type. Appl Math Opt 63(3):341–356
MathSciNet MATH Google Scholar
Başar T (1999) Nash equilibria of risk-sensitive nonlinear stochastic differential games. J Opt Theory Appl 100(3):479–498
MathSciNet MATH Google Scholar
Başar T, Olsder GJ (1999) Dynamic noncooperative game theory, 2nd edn. SIAM, Philadelphia
MATH Google Scholar
Başar T, Bernhard P (1995) $\text{ H }^\infty $ optimal control and related minimax design problems, 2nd edn. Birkhäuser, Boston
MATH Google Scholar
Bardi M, Priuli FS (2014) Linear-quadratic N-person and mean-field games with ergodic cost. SIAM J Control Opt 52(5):3022–3052
MathSciNet MATH Google Scholar
Bauso D, Tembine H, Başar T (2016) Robust mean field games. Dyn Games Appl 6(3):277–303
MathSciNet MATH Google Scholar
Bauso D, Tembine H, Başar T (2016) Opinion dynamics in social networks through mean-field games. SIAM J Control Opt 54(6):3225–3257
MathSciNet MATH Google Scholar
Bensoussan A, Frehse J, Yam P (2013) Mean field games and mean field type control theory. Springer, New York
MATH Google Scholar
Bensoussan A, Sung KCJ, Yam SCP, Yung SP (2014) Linear–quadratic mean field games. arXiv:1404.5741
Bensoussan A, Chau MHM, Yam SCP (2015) Mean field Stackelberg games: aggregation of delayed instructions. SIAM J Control Opt 53(4):2237–2266
MathSciNet MATH Google Scholar
Billingsley P (1999) Convergence of probability measures, 2nd edn. Wiley, New York
MATH Google Scholar
Bolley F (2008) Separability and completeness for the Wasserstein distance. Seminarire de probabilities XLI, Lecture Notes Math, pp 371–377
Google Scholar
Cardaliaguet P (2012) Notes on mean field games. Technical report
Cardaliaguet P, Lehalle CA (2018) Mean field game of controls and an application to trade crowding. Math Financ Econ 12(3):335–363
MathSciNet MATH Google Scholar
Carmona R, Delarue F (2013) Mean field forward–backward stochastic differential equations. Electron Commun Probab 18(68):1–15
MathSciNet MATH Google Scholar
Carmona R, Delarue F (2013) Probabilistic analysis of mean-field games. SIAM J Control Opt 51(4):2705–2734
MathSciNet MATH Google Scholar
Carmona R, Delarue F (2015) Forward–backward stochastic differential equations and controlled McKean–Vlasov dynamics. Ann Probab 43(5):2647–2700
MathSciNet MATH Google Scholar
Carmona R, Delarue F, Lachapelle A (2012) Control of McKean–Vlasov dynamics versus mean field games. Math Financ Econ 7(2):131
MathSciNet MATH Google Scholar
Conway JB (2000) A course in functional analysis. Springer, Berlin
Google Scholar
Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to algorithms, 3rd edn. The MIT Press, London
MATH Google Scholar
Couillet R, Perlaza SM, Tembine H, Debbah M (2012) Electrical vehicles in the smart grid: a mean field game analysis. IEEE J Sel Areas Commun 30(6):1086–1096
Google Scholar
Delarue F (2002) On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case. Stoch Process Appl 99:209–286
MathSciNet MATH Google Scholar
Djehiche B, Hamadene S (2016) Optimal control and zero-sum stochastic differential game problems of mean-field type. arXiv:1603.06071v3
Duncan TE (2013) Linear–exponential–quadratic Gaussian control. IEEE Trans Autom Control 58(11):2910–2911
MathSciNet MATH Google Scholar
Fleming W, Rishel R (1975) Deterministic and stochastic optimal control. Springer, Berlin
MATH Google Scholar
Fleming W, Soner HM (2006) Controlled Markov processes and viscosity solutions, 2nd edn. Springer, Berlin
MATH Google Scholar
Fleming WH, James MR (1995) The risk-sensitive index and the $\text{ H }_2$ and $\text{ H }_\infty $ norms for nonlinear systems. Math Control Signals Syst 8:199–221
Google Scholar
Horowitz J, Karandikar R (1994) Mean rate of convergence of empirical measures in the Wasserstein metric. J Comput Appl Math 55:261–273
MathSciNet MATH Google Scholar
Huang J, Li N (2018) Linear–quadratic mean-field game for stochastic delayed systems. IEEE Trans Autom Control 63(8):2722–2729
MathSciNet MATH Google Scholar
Huang M (2010) Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J Control Opt 48(5):3318–3353
MathSciNet MATH Google Scholar
Huang M, Caines PE, Malhamé RP (2003) Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions. In: Proceedings of the 42nd IEEE conference on decision and control, pp 98–103
Huang M, Malhamé Roland P, Caines Peter E (2006) Large population stochastic dynamic games: closed-loop McKean–Vlasov systems and the Nash certainty equivalence principle. Commun Inf Syst 6(3):221–252
MathSciNet MATH Google Scholar
Huang M, Caines PE, Malhamé RP (2007) Large-population cost-coupled LQG problems with nonuniform agents: individual-mass behavior and decentralized $\epsilon $-Nash equilibria. IEEE Trans Autom Control 52(9):1560–1571
MathSciNet MATH Google Scholar
Jacobson D (1973) Optimal stochastic linear systems with exponential performance criteria and their relation to deterministic differential games. IEEE Trans Autom Control 18(2):124–131
MathSciNet MATH Google Scholar
Jordain B, Meleard S, Woyczynski W (2008) Nonlinear SDEs driven by Levy processes and related PDEs. Latin Am J Probab 4:1–29
MATH Google Scholar
Karatzas I, Shreve SE (2000) Brownian motion and stochastic calculus. Springer, Berlin
MATH Google Scholar
Lasry JM, Lions PL (2007) Mean field games. Jpn J Math 2(1):229–260
MathSciNet MATH Google Scholar
Li T, Zhang JF (2008) Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Trans Autom Control 53(7):1643–1660
MathSciNet MATH Google Scholar
Lim EBA, Zhou XY (2005) A new risk-sensitive maximum principle. IEEE Trans Autom Control 50(7):958–966
MathSciNet MATH Google Scholar
Luenberger DG (1969) Optimization by vector space methods. Wiley, New York
MATH Google Scholar
Ma J, Yong J (1999) Forward–backward stochastic differential equations and their applications. Springer, Berlin
MATH Google Scholar
Ma J, Callaway DS, Hiskens IA (2013) Decentralized charging control of large populations of plug-in electric vehicles. IEEE Trans Control Syst Technol 21(1):67–78
Google Scholar
Moon J, Başar T (2015) Linear-quadratic stochastic differential Stackelberg games with a high population of followers. In: Proceedings of the 54th IEEE conference on decision and control, pp 2270–2275
Moon J, Başar T (2016) Robust mean field games for coupled Markov jump linear systems. Int J Control 89(7):1367–1381
MathSciNet MATH Google Scholar
Moon J, Başar T (2017) Linear quadratic risk-sensitive and robust mean field games. IEEE Trans Autom Control 62(3):1062–1077
MathSciNet MATH Google Scholar
Moon J, Başar T (2018) Linear quadratic mean field Stackelberg differential games. Automatica 97:200–213
MathSciNet MATH Google Scholar
Nourian M, Caines P (2013) $\epsilon $-Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J Control Opt 51(4):3302–3331
MathSciNet MATH Google Scholar
Nourian M, Caines PE, Malhamé RP, Huang Minyi (2013) Nash, social and centralized solutions to consensus problems via mean field control theory. IEEE Trans Autom Control 58(3):639–653
MathSciNet MATH Google Scholar
Nourian M, Caines PE, Malhamé RP (2014) A mean field game synthesis of initial mean consensus problems: a continuum approach for non-Gaussian behavior. IEEE Trans Autom Control 59(2):449–455
MathSciNet MATH Google Scholar
Pardoux E, Tang S (1999) Forward–backward stochastic differential equations and quasilinear parabolic PDEs. Probab Theory Relat Fields 114:123–150
MathSciNet MATH Google Scholar
Parise F, Grammatico S, Colombino M, Lygeros J (2014) Mean field constrained charging policy for large populations of plug-in electric vehicles. In: Proceedings of 53rd IEEE conference on decision and control, pp 5101–5106
Peng S, Wu Z (1999) Fully coupled forward–backward stochastic differential equations and applications to optimal control. SIAM J Control Opt 37(3):825–843
MathSciNet MATH Google Scholar
Pham H (2009) Continuous-time stochastic control and optimization with financial applications. Springer, Berlin
MATH Google Scholar
Rachev ST, Ruschendorf L (1998) Mass transportation theory: volume I: theory. Springer, Berlin
MATH Google Scholar
Smart DR (1974) Fixed point theorems. Cambridge University Press, Cambridge
MATH Google Scholar
Tembine H, Zhu Q, Başar T (2014) Risk-sensitive mean field games. IEEE Trans Autom Control 59(4):835–850
MathSciNet MATH Google Scholar
Touzi N (2013) Optimal stochastic control, stochastic target problems, and backward SDE. Springer, Berlin
MATH Google Scholar
Wang B, Zhang J (2012) Mean field games for large-population multiagent systems with Markov jump parameters. SIAM J Control Opt 50(4):2308–2334
MathSciNet MATH Google Scholar
Weintraub GY, Benkard CL, Van Roy B (2008) Markov perfect industry dynamics with many firms. Econometrica 76(6):1375–1411
MathSciNet MATH Google Scholar
Whittle P (1990) Risk-sensitive optimal control. Wiley, New York
MATH Google Scholar
Yin H, Mehta PG, Meyn SP, Shanbhag UV (2012) Synchronization of coupled oscillators is a game. IEEE Trans Autom Control 57(4):920–935
MathSciNet MATH Google Scholar
Yong J, Zhou XY (1999) Stochastic controls: Hamiltonian systems and HJB equations. Springer, Berlin
MATH Google Scholar
Zhou XY (1996) Sufficient conditions of optimality for stochastic systems with controllable diffusions. IEEE Trans Autom Control 41(8):1176–1179
MathSciNet MATH Google Scholar
Zhu Q, Başar T (2013) Multi-resolution large population stochastic differential games and their application to demand response management in the smart grid. Dyn Games Appl 3(1):66–88
MathSciNet MATH Google Scholar
Zhu Q, Tembine H, Başar T (2011) Hybrid risk-sensitive mean-field stochastic differential games with application to molecular biology. In: Proceedings of the 50th IEEE CDC and ECC, pp 4491–4497

Download references

Acknowledgements

The authors would like to thank the Associate Editor and the two anonymous reviewers for careful reading of and helpful suggestions on the earlier version of the manuscript. This research was supported in part by the National Research Foundation of Korea (NRF) Grant funded by the Ministry of Science and ICT, South Korea (NRF-2017R1E1A1A03070936, NRF-2017R1A5A1015311), in part by Institute for Information and Communications Technology Promotion (IITP) Grant funded by the Korea government (MSIT), South Korea (No. 2018-0-00958), and in part by the Office of Naval Research (ONR) MURI Grant N00014-16-1-2710.

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, South Korea
Jun Moon
Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
Tamer Başar

Authors

Jun Moon
View author publications
You can also search for this author in PubMed Google Scholar
Tamer Başar
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jun Moon.

Appendices

Appendix A: The Risk-Sensitive Maximum Principle

This appendix proves the maximum principle for the risk-sensitive optimal control problem. We note that the risk-sensitive maximum principle in [42] considered the one-dimensional Brownian motion with the maximization of the Hamiltonian. Here, we extend this to the general r-dimensional Brownian motion with the minimization of the Hamiltonian (but we still call it the “maximum principle”).^{Footnote 6}

Consider the SDE

$$\begin{aligned} \mathrm{d}x(t)&= f(t,x,u)\mathrm{d}t + \sigma (t) \mathrm{d}B(t),~ x(0)=x_0, \end{aligned}$$

and the risk-sensitive cost function

$$\begin{aligned} J(u)&= \gamma \log {\mathbb {E}} \left[ \exp \left\{ \frac{1}{\gamma } \int _0^T l(t,x(t),u(t))\mathrm{d}t + \frac{1}{\gamma }m(x(T))\right\} \right] , \end{aligned}$$

(A.1)

where B is the r-dimensional standard Brownian motion. Let $\{\mathcal {F} \}_{t \ge 0}$ be the filtration generated by B.

We first state the risk-sensitive maximum principle, which is similar to [42, Theorem 3.1]. See also [65, Theorem 3.2, Chapter 3] for the risk-neutral stochastic maximum principle.

Theorem A1

Let $(x,\bar{u})$ be an optimal pair for the risk-sensitive optimal control problem in (A.1). Then there exists a unique pair $(p,q) \in \mathcal {L}_{\mathcal {F}}^2(0,T;{\mathbb {R}}^n) \times \mathcal {L}_{\mathcal {F}}^2(0,T;{\mathbb {R}}^{n \times r})$ such that it is the solution of the following BSDE:

$$\begin{aligned} \mathrm{d}p(t)&= -\left[ f_x^\top (t,x,\bar{u}) p(t) + l_x(t,x,\bar{u}) + \frac{1}{\gamma } q(t) \sigma ^\top (t) p(t) \right] \mathrm{d}t + q(t) \mathrm{d}B(t)\nonumber \\ p(T)&= m_x(x(T)), \end{aligned}$$

(A.2)

Also, the following optimality condition holds:

$$\begin{aligned} H(t,x,\bar{u},p,q) = \min _{u \in U} H(t,x,u,p,q), \end{aligned}$$

(A.3)

where the Hamiltonian H is given by

$$\begin{aligned} H(t,x,u,p,q) = p^\top f + l + {{\,\mathrm{Tr}\,}}\left( q^\top \sigma \right) + \frac{1}{\gamma }p^\top \sigma \sigma ^\top p. \end{aligned}$$

(A.4)

$\square $

Remark A1

Note that the term $\frac{1}{\gamma } q(t) \sigma ^\top (t) p(t)$ in the BSDE p in (A.2) is different from that of the one-dimensional Brownian motion case in [42]. $\square $

We now prove Theorem A1.

Proof of Theorem A1

First, it is easy to see that the risk-sensitive optimal control problem in (A.1) can be converted into the following Mayer form:

$$\begin{aligned} J(u)&= \gamma \log {\mathbb {E}} \left[ \exp \{ \frac{1}{\gamma } (m(x(T)) + y(T))\} \right] \end{aligned}$$

(A.5)

$$\begin{aligned} \mathrm{d}x(t)&= f(t,x,u)\mathrm{d}t + \sigma (t) \mathrm{d}B(t),~ x(0)=x_0 \end{aligned}$$

(A.6)

$$\begin{aligned} \mathrm{d}y(t)&= l(t,x,u)\mathrm{d}t,~ y(0) = 0, \end{aligned}$$

(A.7)

Note that with this reformulation, we can apply the risk-neutral maximum principle in [65, Theorem 3.2, Chapter 3].

From [65, Theorem 3.2, Chapter 3], the Hamiltonian for the optimal control problem in (A.5) is given by

$$\begin{aligned} \bar{H}(t,x,u,p,q)&= \left\langle p,\begin{pmatrix} f \\ l \end{pmatrix} \right\rangle + {{\,\mathrm{Tr}\,}}(q^\top \begin{pmatrix} \sigma \\ 0 \end{pmatrix}), \end{aligned}$$

(A.8)

where p is the adjoint process satisfying

$$\begin{aligned} \mathrm{d}p(t)&= - \begin{pmatrix} f_x(t,x,\bar{u}) &{} 0 \\ l_x^\top (t,x,\bar{u}) &{} 0 \end{pmatrix}^\top p(t) + q(t) \mathrm{d}B(t) \nonumber \\ p(T)&= \frac{1}{\gamma }\exp \left\{ \frac{1}{\gamma } (m(x(T)) + y(T))\right\} \begin{pmatrix} m_x(x(T)) \\ 1 \end{pmatrix}. \end{aligned}$$

(A.9)

Note that p is an $(n+1)$-dimensional adjoint process with $p^\top =(p_1^\top ,p_2)^\top $, where $p_1$ is an n-dimensional stochastic process associated with the constraint (A.6). Moreover, q is an $(n+1)\times r$ dimensional matrix stochastic process.

We define the associated value function for (A.5):

$$\begin{aligned} v(t) = \inf _{u} {\mathbb {E}} \left[ \exp \left\{ \frac{1}{\gamma } (m(x(T)) + y(T))\right\} \right] , \end{aligned}$$

where $v(t) >0$ and $v(T) = \exp \{ \frac{1}{\gamma } (m(x(T)) + y(T))\}$. Due to the relationship between the maximum principle and dynamic programming, the associated value function logarithmic transformation (see [29, Chapter VI], [5, 42]) leads to

$$\begin{aligned} p(t) = v_{(x,y)}(t),~~V = \gamma \log v, \end{aligned}$$

(A.10)

where p(T) satisfies the terminal condition in (A.9). The gradient of V can be written as

$$\begin{aligned} \tilde{p}(t) = \gamma \frac{p(t)}{v(t)}, \end{aligned}$$

(A.11)

where $\tilde{p} = (\bar{p}^\top , \tilde{p}_2)^\top \in {\mathbb {R}}^{n+1}$, in which $\bar{p}$ is an n-dimensional backward stochastic process.

We now obtain the expression of $\tilde{p}$ in (A.11). Under the non-degeneracy assumption (stated in A4) in Sect. 2), v is the smooth value function of the optimal control problem in (A.5) as mentioned in [65, Chapter 4] and [42]. Also, we can see that there is no running cost. Then in view of the proof in [65, Theorem 4.1, Chapter 5] and [42], and by using the Itô formula, we have

$$\begin{aligned} \mathrm{d}v(t)&= p_1^\top (t) \sigma (t) \mathrm{d}B(t). \end{aligned}$$

(A.12)

By using the Itô formula again with (A.12), and from (A.11), we have

$$\begin{aligned} \mathrm{d}\frac{1}{v(t)}&= - \frac{1}{\gamma v(t)} \bar{p}^\top (t) \sigma (t) \mathrm{d}B(t) + \frac{1}{\gamma ^2 v(t)} \bar{p}^\top (t) \sigma (t) \sigma ^\top (t) \bar{p}(t) \mathrm{d}t. \end{aligned}$$

(A.13)

By using the Itô formula for (A.11) with (A.13) and (A.9), we have

$$\begin{aligned} \mathrm{d}\tilde{p}(t)&= \mathrm{d} \left( \gamma \frac{p(t)}{v(t)}\right) \nonumber \\&= - \begin{pmatrix} f_x(t,x,\bar{u}) &{} 0 \\ l_x^\top (t,x,\bar{u}) &{} 0 \end{pmatrix}^\top \tilde{p}(t)\mathrm{d}t - \frac{1}{\gamma } \tilde{q}(t) \sigma ^\top (t) \bar{p}(t) \mathrm{d}t + \tilde{q}(t) \mathrm{d}B(t), \end{aligned}$$

(A.14)

where

$$\begin{aligned} \tilde{p}(T) = \begin{pmatrix} m_x(x(T)) \\ 1 \end{pmatrix}, \end{aligned}$$

and $\tilde{q}$ is an $(n+1) \times r$ dimensional stochastic process:

$$\begin{aligned} \tilde{q}(t)&= \frac{\gamma q(t)}{v(t)} - \frac{1}{\gamma } \tilde{p}(t) \bar{p}^\top (t) \sigma (t). \end{aligned}$$

In view of the value function transformation in (A.10), it is easy to see that $\tilde{p}_2(t) = 1$ for all $t \in [0,T]$, and $\tilde{q}(t)$ satisfies

$$\begin{aligned} \tilde{q}(t)&= \frac{\gamma q(t)}{v(t)} - \frac{1}{\gamma } \tilde{p}(t) \bar{p}^\top (t) \sigma (t) =: \begin{pmatrix} \bar{q}(t) \\ \tilde{q}_2(t) \end{pmatrix}, \end{aligned}$$

(A.15)

where $\tilde{q}(t)$ is an $(n \times r)$-dimensional stochastic process, whereas $\tilde{q}_2(t)$ is an $(1 \times r)$-dimensional stochastic process with $\tilde{q}_2(t) = 0$ a.s. for all $t \in [0,T]$.

We note that $\tilde{p} = (\bar{p}^\top ,\tilde{p}_2)^\top $ and $\bar{p}(T) = m_x(x(T))$. Then expanding (A.14) and together with (A.15), we can easily show that $\bar{p}$ satisfies the backward SDE in (A.2). Moreover, by substituting the relationships of (A.11) and (A.15) into the Hamiltonian in (A.8), one can arrive at the Hamiltonian of the risk-sensitive optimal control problem in (A.4) with the optimality condition given in (A.3). Since our derivation can be reversed, this completes the proof of the risk-sensitive maximum principle for the r-dimensional Brownian motion. The proof of Theorem A1 is done. $\square $

Appendix B: Proof of Theorem 1

First, we note that from the risk-sensitive maximum principle in Theorem A1 in “Appendix A” section, with the optimal control $\bar{u}$, there exists a unique solution of the FBSDE in (7). Then under (A2)–(A5), by applying Four Step Scheme introduced in [25] (see also [53] and [44, Chapter 4] for Four Step Scheme under the strong regularity assumptions), the BSDE for p can be expressed in terms of x as follows:

$$\begin{aligned} \theta (t,x(t)) = p(t),~ \theta (T,x) = p(T) \end{aligned}$$

(B.1)

almost surely for $t \in [0,T]$ [25, Corollary 1.5]. In fact, in view of Four Step Scheme and Itô formula, one can show that $\theta (t,x)$ is a classical solution of the particular quasi-linear parabolic partial differential equation with the terminal condition $\theta (T,x) = p(T) = m_x(x(T),\mu (T))$ [25, ($\hbox {E}^\prime $)], [44, Chapter 4] and [53]. Also, from [25, Corollary 1.5], we have

$$\begin{aligned} |\theta (t,x_1) - \theta (t,x_2) | \le c |x_1 - x_2 |,~ \forall x_1,x_2 \in {\mathbb {R}}^n, \end{aligned}$$

(B.2)

for some constant $c \ge 0$. Hence, with (B.1), the SDE for x in (7) can be written as follows:

$$\begin{aligned} \mathrm{d}x(t)&=f(t,x,\mu ,w(t,x,\mu ,\theta (t,x)))\mathrm{d}t + \sigma (t) \mathrm{d}B(t), \end{aligned}$$

(B.3)

where $x(0) = x_0$. Note that (B.3) is now decoupled with the BSDE p in (7).

We now use Schauder’s fixed-point theorem to complete the proof. Its statement is given as follows: LetXbe a nonempty closed and bounded convex subset of a normed spaceS. LetTbe a continuous mapping ofXinto a compact subset$K \subset X$. ThenThas a fixed point. [58, Theorem 4.1.1].

We first note that the 1-Wasserstein metric on $\mathcal {P}_1(\mathcal {C}([0,T];{\mathbb {R}}^n))$ is equivalent to the Kantorovich–Rubinstein distance [16, Theorem 5.5]

$$\begin{aligned} W_1(\mu ^*(t),\mu ^\prime (t))&= \sup \left\{ \int _{\mathcal {C}([0,T];{\mathbb {R}}^n)} f(x) \mathrm{d}\mu ^*(t,x) - \int _{\mathcal {C}([0,T];{\mathbb {R}}^n)} f(x) \mathrm{d}\mu ^\prime (t,x) \right\} , \end{aligned}$$

where $\mu ^*,\mu ^\prime \in \mathcal {P}_1(\mathcal {C}([0,T];{\mathbb {R}}^n))$, and the supremum is taken over the set of all 1-Lipschitz continuous maps f. Indeed, it can be seen that the 1-Wasserstein metric is induced by the Kantorovich–Rubinstein norm [57], which, together with the fact that $\mathcal {C}([0,T];{\mathbb {R}}^n)$ is a normed space with the norm $|\cdot |_{\infty } := \sup _{0 \le t \le T} |\cdot |$ [43], implies that $\mathcal {P}_1(\mathcal {C}([0,T];{\mathbb {R}}^n))$ is a normed space [15].

We define the following set for $c > 0$:

$$\begin{aligned} \mathcal {E}=\{\mu \in \mathcal {P}_4(\mathcal {C}\left( [0,T];{\mathbb {R}}^n\right) )~:~M_4(\mu ) \le c\}, \end{aligned}$$

where we have the inclusion $\mathcal {E} \subset \mathcal {P}_2(\mathcal {C}([0,T];{\mathbb {R}}^n)) \subset \mathcal {P}_1(\mathcal {C}([0,T];{\mathbb {R}}^n)) $. Then, it is easy to check that $\mathcal {E}$ is bounded and convex and is closed with respect to the 1-Wasserstein metric. The latter follows from the fact that for any convergent sequence of measures $\mu _k \in \mathcal {E}$, $k \ge 1$, to $\mu $ with the 1-Wasserstein metric, we have $W_4(\mu _k,\mu ) \le W_1(\mu _k,\mu )$ for $k \ge 1$ [16, Section 5], which implies $\mu \in \mathcal {E}$. In the proof below, a constant c can vary from line to line.

Now, note that $\bar{u} \in {\mathcal {U}}$ is the optimal control that minimizes the Hamiltonian in (8). Then with (A2), (A5) and (B.2), the standard estimate of the SDEs in [65, Theorem 6.3, Chapter 1] implies that there exists a constant c, depending on $x_0$, $\beta $ and T, such that ${\mathbb {E}}[\sup _{0 \le t \le T} |x(t)|^4] \le c$. Hence, by considering the mapping $\varPsi $ on $\mathcal {E}$, and noticing that $\mathcal {E} \subset \mathcal {P}_1(\mathcal {C}([0,T];{\mathbb {R}}^n))$, we have $\varPsi : \mathcal {E} \rightarrow \mathcal {E}$, i.e., $\varPsi \mu \in \mathcal {E}$, for any $\mu \in \mathcal {E}$.

To prove compactness of $\varPsi (\mathcal {E})$, we show tightness of a sequence of measures $\varPsi \mu _k$ with respect to the 1-Wasserstein metric, where $\mu _k \in \mathcal {E}$, $k \ge 1$. Note that the initial condition of the SDE is not random and $\sigma $ is uniformly bounded in $t \in [0,T]$ from (A4). Then, from [65, Theorem 6.3, Chapter 1], for any $\delta > 0$ and $s \in [t,t+\delta ]$, ${\mathbb {E}}[|x(s) - x(t)|^2] \le c$, where c depends on the initial condition of the SDE and $\delta $. Hence, in view of [14, Theorem 7.3] and [14, Corollary, page 83], $\{\varPsi \mu _k\}$ is tight, which implies that $\varPsi (\mathcal {E})$ is relatively compact with respect to the 1-Wasserstein metric [14, Theorem 5.1].

It remains to show that $\varPsi $ is continuous on $\mathcal {E}$ with respect to the 1-Wasserstein metric. That is, for every $\epsilon > 0$, there exists $\eta > 0$ such that with $\mu ^*,\mu ^\prime \in \mathcal {E}$, $W_1(\mu ^*,\mu ^\prime ) < \eta $ implies $W_1(\varPsi \mu ^*,\varPsi \mu ^\prime )< \epsilon $. Note that $\mu ^*$ is not a fixed point of $\varPsi $. Let $x^*$ and $x^\prime $ be generated by two SDEs corresponding to $\mu ^*$ and $\mu ^\prime $, respectively. From the definition of $W_1$, for any $\mu ^*,\mu ^\prime \in \mathcal {E} \subset \mathcal {P}_2(\mathcal {C}([0,T];{\mathbb {R}}^n)) \subset \mathcal {P}_1(\mathcal {C}([0,T];{\mathbb {R}}^n))$,

$$\begin{aligned} W_1(\varPsi \mu ^*,\varPsi \mu ^\prime ) \le {\mathbb {E}} \left[ \sup _{0 \le t \le T} |x^*(t) - x^\prime (t)| \right] . \end{aligned}$$

(B.4)

From Gronwall’s lemma, (A2), (A5) and (B.2) and by following the proof in [19, Proposition 3.8], there exists a constant $c > 0$ such that ${\mathbb {E}} [ \sup _{0 \le t \le T} |x^*(t) - x^\prime (t)|^2 ] \le c (\int _0^T W_2^2(\mu ^*(t),\mu ^\prime (t)) \mathrm{d}t )^{1/2}$. Then by using Jensen’s inequality and the fact that $W_2(\mu ^*(t),\mu ^\prime (t)) \le W_1(\mu ^*(t),\mu ^\prime (t))$ [16, Section 5], we have

$$\begin{aligned} {\mathbb {E}} \left[ \sup _{0 \le t \le T} |x^*(t) - x^\prime (t) | \right] \le c \left( \int _0^T W_1^2(\mu ^*(t),\mu ^\prime (t)) \mathrm{d}t \right) ^{1/4}. \end{aligned}$$

This, together with (B.4), implies continuity of $\varPsi $ on $\mathcal {E}$ with respect to the 1-Wasserstein metric. This completes the proof of the theorem. $\square $

Appendix C: Proof of Theorem 2

To prove Theorem 2, we first need the following lemma:

Lemma C1

There exists a constant $c>0$, dependent on n, $M_{5+n}< \infty $ and T, such that

$$\begin{aligned} {\mathbb {E}}\left[ W_2^2\left( \nu _N^*(t),\mu ^*(t)\right) \right] \le \frac{c}{N^{2/(n+4)}},~ \forall t \in [0,T]. \end{aligned}$$

Moreover, $W_2(\nu _N^*(t),\mu ^*(t)) \rightarrow 0$ as $N \rightarrow \infty $ almost surely for all $t \in [0,T]$.

A proof of this lemma can be found in [57, Theorem 10.2.1] and [59, Proposition 5.1], or [31, 38]. In fact, the proof relies on Gronwall’s lemma with the Lipschitz property, the strong law of large numbers of the empirical distribution, and exchangeability of the stochastic processes $x_i^*$. The second part of Lemma C1 follows from [57, page 323].

We now proceed with the proof of Theorem 2.

Proof of Theorem 2

Since the players are symmetric, that is, the players are invariant under arbitrary permutations, we only need to consider the case when $i=1$. In the proof below, the constant c can vary from line to line.

We note that $x_1^*$ defined in (10) is the SDE for player $i=1$ with the optimal distributed control $u_1^*$ given in (11). As mentioned, $x_1^*$ is decoupled with other players since f does not depend on the mean field from (A6), which implies that it is statistically independent from other players. Furthermore, we note that $x_1$ is the SDE of player $i=1$ with an arbitrary control $u_1 \in {\mathcal {U}}_{\mathcal {F}}^1$. Then it should be clear from the definitions of $x_1$ and $x_1^*$ that $x_1$ is identical to $x_1^*$ when $u_1 = u_1^*$.

In the proof below, note that the empirical distribution $\nu _N^*$ in (12) is obtained when the N players are under the optimal distributed control in (11). Then in view of Lemma C1, we have for $t \in [0,T]$,

$$\begin{aligned} {\mathbb {E}} \left[ W_2^2(\nu _N^*(t),\mu ^*(t)) \right] = O\left( \frac{1}{N^{2/(n+4)}} \right) . \end{aligned}$$

We also note that

$$\begin{aligned} \nu _N(t) = \frac{1}{N}\delta _{x_1(t)} + \frac{1}{N}\sum _{i=2}^N \delta _{x_i^*(t)}, \end{aligned}$$

which is the empirical distribution when $x_1$ is under an arbitrary control $u_1$, while other players are with the optimal distributed control in (11).

Due to boundedness of f and $\sigma $ in t, one can show that by using Itô isometry, there exists a constant $c > 0$ (dependent on $\beta $ and T) such that

$$\begin{aligned} {\mathbb {E}} \left[ \sup _{0 \le t \le T} |x_1(t)|^2 \right] \le c + c {\mathbb {E}} \left[ \int _0^T |u_1(t)|^2 \mathrm{d}t \right] . \end{aligned}$$

(C.1)

Since $u_i^*$ satisfies ${\mathbb {E}}[\int _0^T |u_i^*(t)|^2 \mathrm{d}t] < \infty $, $2 \le i \le N$ and f and $\sigma $ are bounded, from (A2), (A5), (B.2) and [65, Theorem 6.3, Chapter 1], we can show the estimate ${\mathbb {E}} [ \sup _{0 \le t \le T} |x_i^*(t)|^2 ] \le c$ for $2 \le i \le N$. This, together with (C.1) and Itô isometry leads to the following inequality:

$$\begin{aligned}&\frac{1}{N}\left( {\mathbb {E}} \left[ \sup _{0 \le t \le T} |x_1(t)|^2 \right] + \sum _{i=2}^N {\mathbb {E}} \left[ \sup _{0 \le t \le T} |x_i^*(t)|^2 \right] \right) \le c + \frac{c}{N}{\mathbb {E}} \left[ \int _0^T |u_1(t)|^2 \mathrm{d}t \right] , \end{aligned}$$

(C.2)

which is also bounded since we have $u_1 \in {\mathcal {U}}_{\mathcal {F}}^1$ with ${\mathbb {E}}[\int _0^T |u_1(t)|^2 \mathrm{d}t ] < \infty $.

Consider the following inequality:

$$\begin{aligned}&{\mathbb {E}} \left[ W_2^2(\nu _N(t),\mu ^*(t)) \right] \end{aligned}$$

(C.3)

$$\begin{aligned}&\quad \le c {\mathbb {E}} \left[ W_2^2 \left( \nu _N(t),\frac{1}{N-1} \sum _{i=2}^N \delta _{x_i^*(t)} \right) \right] \end{aligned}$$

(C.4)

$$\begin{aligned}&\qquad + c {\mathbb {E}} \left[ W_2^2 \left( \frac{1}{N-1} \sum _{i=2}^N \delta _{x_i^*(t)},\mu ^*(t) \right) \right] . \end{aligned}$$

(C.5)

We now show boundedness of ${\mathbb {E}} [ W_2^2(\nu _N(t),\mu ^*) ]$ in (C.3) with respect to N for $t \in [0,T]$.

First, from Lemma C1, we have

$$\begin{aligned} {\mathbb {E}} \left[ W_2^2 \left( \frac{1}{N-1} \sum _{i=2}^N \delta _{x_i^*(t)},\mu ^*(t) \right) \right] \le \frac{c}{N^{2/(n+4)}}, \end{aligned}$$

(C.6)

which is the bound for (C.5).

Also, for (C.4), by the definition of $W_2$, we have

$$\begin{aligned}&{\mathbb {E}} \left[ W_2^2 \left( \nu _N(t),\frac{1}{N-1} \sum _{i=2}^N \delta _{x_i^*(t)} \right) \right] \le \frac{c}{N(N-1)} \sum _{i=2}^n {\mathbb {E}} \left[ |x_1(t) - x_i^*(t)|^2 \right] \le \frac{c}{N}, \end{aligned}$$

(C.7)

where the last inequality follows from (C.1) and (C.2). Hence, for (C.3), in view of (C.6), and (C.7), we have

$$\begin{aligned} {\mathbb {E}} \left[ W_2^2(\nu _N(t),\mu ^*(t)) \right] \le \frac{c}{N^{2/(n+4)}},~ t \in [0,T]. \end{aligned}$$

(C.8)

By applying Jensen’s inequality, we have (see also [29, Chapter VI])

$$\begin{aligned} J_1^N\left( u^{N*}\right)&\ge {\mathbb {E}} \left[ \int _0^T l(t,x_1^*(t),\nu _N^*(t),u_1^*(t))\mathrm{d}t + m(x_1^*(T),\nu _N^*(T)) \right] =: L_1^N\left( u^{N*}\right) \\ \bar{J}_1\left( u_1^*,\mu ^*\right)&\ge {\mathbb {E}} \left[ \int _0^T l(t,x_1^*(t),\mu ^*(t),u_1^*(t))\mathrm{d}t + m(x_1^*(T),\mu ^*(T)) \right] =: \bar{L}_1\left( u_1^*,\mu ^*\right) . \end{aligned}$$

Note that $L_1^N$ and $\bar{L}_1$ are risk-neutral cost functions. Then, there exists a constant c, dependent on T and the Lipschitz constant $\beta $ in (A2), such that

$$\begin{aligned} \left| J_1^N\left( u^{N*}\right) - \bar{J}_1\left( u_1^*,\mu ^*\right) \right|&\le c \left| L_1^N\left( u^{N*}\right) - \bar{L}_1\left( u_1^*,\mu ^*\right) \right| . \end{aligned}$$

(C.9)

Therefore, by using Cauchy–Schwarz inequality, the Lipschitz properties of l and m in (A2), and the fact that $u_1^* \in {\mathcal {U}}_{\mathcal {F}^1}^1$, we can show that

$$\begin{aligned}&\left| J_1^N\left( u^{N*}\right) - \bar{J}_1\left( u_1^*,\mu ^*\right) \right| \\&\quad \le c {\mathbb {E}} \left[ W_2^2(\mu ^*(T),\nu _N^*(T)) \right] ^{1/2} + c \int _0^T {\mathbb {E}} \left[ W_2^2(\mu ^*(t),\nu _N^*(t)) \right] ^{1/2} \mathrm{d}t\le \frac{c}{N^{1/(n+4)}}, \end{aligned}$$

where the first inequality follows from (C.9) and (A2), and the second inequality is due to Lemma C1. In the first inequality, we have used the fact that ${\mathbb {E}} \left[ W_2(\mu ^*(t),\nu _N^*(t)) \right] \le {\mathbb {E}} \left[ W_2^2(\mu ^*(t),\nu _N^*(t)) \right] ^{1/2}$ due to Jensen’s inequality.

In view of the above inequality, we have

$$\begin{aligned}&\left| J_1^N\left( u^{N*}\right) - \bar{J}_1\left( u_1^*,\mu ^*\right) \right| = O\left( \frac{1}{N^{1/(n+4)}} \right) , \end{aligned}$$

(C.10)

which shows that for any i, $1 \le i \le N$,

$$\begin{aligned} J_i^N\left( u^{N*}\right) - \frac{c}{N^{1/(n+4)}} \le \bar{J}_i\left( u_i^*,\mu ^*\right) . \end{aligned}$$

This implies that for sufficiently large N, the cost difference between $J_i^N(u^{N*})$ and $\bar{J}(u_i^*,\mu ^*)$ is negligible as a consequence of Lemma C1. This result can also be explained by the law of large numbers of the empirical distribution of $x_i^*$, $1 \le i \le N$, due to Lemma C1.

Furthermore, with a similar reasoning as in (C.9) and (C.10), and due to the empirical estimate obtained in (C.8), we have $J_1^N(u_1,u_2^*,\ldots ,u_N^*) \ge \bar{J}_1(u_1,\mu ^*) - \frac{c}{N^{1/(n+4)}} \ge \bar{J}_1(u_1^*,\mu ^*) - \frac{c}{N^{1/(n+4)}}$. Note that the second inequality follows from step (i) and (15), since $\bar{J}_1(u_1^*,\mu ^*) \le \bar{J}_1(u_1,\mu ^*)$ for $u_1 \in {\mathcal {U}}_{\mathcal {F}}^1$. This, together with (C.10), implies that the set of the optimal distributed controls, $u^{N*}=\{u_1^*,\ldots ,u_N^*\}$, where $u_i^*$ is given in (11), constitutes an $\epsilon _N$-Nash equilibrium. Also, we have $\epsilon _N \rightarrow 0$ as $N \rightarrow \infty $ with the convergence rate of $O(1/N^{1/(n+4)})$. This completes the proof of the theorem. $\square $

Appendix D: Lemma for Sect. 5

We have the following lemma for Sect. 5, which is a modified version of Lemma C1 in Appendix C.

Lemma D1

Suppose that the conditions in Theorem 3 hold. Then the following estimate holds: for $t \in [0,T]$,

$$\begin{aligned} {\mathbb {E}}\left[ W_2^2\left( \nu _N^*(t),\mu ^*(t)\right) \right] = O\left( \frac{1}{N^{2/(n+4)}} + \sup _{ k \in \mathcal {K}} | \pi _k^N - \pi _k|^2 \right) . \end{aligned}$$

Moreover, $W_2(\nu _N^*(t),\mu ^*(t)) \rightarrow 0$ as $N \rightarrow \infty $ almost surely for all $t \in [0,T]$.

Proof

First, observe that the empirical distribution of $x_i$, $ 1 \le i \le N$, $\nu _N^*$, with the individual optimal controls and the associated fixed point, that is, $\nu _N^*$, has the following relationship:

$$\begin{aligned} \nu _N^*(t)&= \frac{1}{N}\sum _{i=1}^N \delta _{x_i(t)} = \frac{1}{N} \sum _{k=1}^K \sum _{i \in \mathcal {N}_k} \delta _{x_i(t)} = \frac{1}{N} \sum _{k=1}^K N_k \bar{\nu }_k^{N,*}(t) = \sum _{k=1}^K \pi _k^N \bar{\nu }_k^{N,*}(t), \end{aligned}$$

where $\bar{\nu }_k^{N,*}(t) = \frac{1}{N_k} \sum _{i \in \mathcal {N}_k} \delta _{x_i(t)}$. Note that $\bar{\nu }_k^{N,*}(t) \rightarrow \mu _k$ almost surely as $N_k \rightarrow \infty $ for each $k \in \mathcal {K}$.

Since $W_2$ is a distance, we have

$$\begin{aligned} {\mathbb {E}} \left[ W_2^2(\nu _N^*(t),\mu ^*(t)) \right]&\le c {\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k^N \bar{\nu }_k^{N,*}(t),\sum _{k=1}^K \pi _k^N \mu _k^*(t) \right) \right] \end{aligned}$$

(D.1)

$$\begin{aligned}&\quad + c {\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k^N \mu _k^*(t),\sum _{k=1}^K \pi _k \mu _k^*(t) \right) \right] . \end{aligned}$$

(D.2)

For (D.2), we first show that there exists a constant c, independent of N, such that

$$\begin{aligned}&{\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k^N \mu _k^*(t),\sum _{k=1}^K \pi _k \mu _k^*(t) \right) \right] \le c \sup _{ k \in \mathcal {K}} | \pi _k^N - \pi _k|^2. \end{aligned}$$

(D.3)

By (18), we have $\sum _{k=1}^K \pi _k^N \mu _k^*(t) \rightarrow \sum _{k=1}^K \pi _k \mu _k^*(t)$ as $N \rightarrow \infty $, which is equivalent to saying that $W_2 (\sum _{k=1}^K \pi _k^N \mu _k^*(t),\sum _{k=1}^K \pi _k \mu _k^*(t) ) \rightarrow 0$ as $N \rightarrow \infty $ [57, Chapter 10.2]. This implies (D.3).

We now consider (D.1). It satisfies the following inequality:

$$\begin{aligned}&{\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k^N \bar{\nu }_k^{N,*}(t),\sum _{k=1}^K \pi _k^N \mu _k^*(t) \right) \right] \le {\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k^N \bar{\nu }_k^{N,*}(t),\sum _{k=1}^K \pi _k \bar{\nu }_k^{N,*}(t) \right) \right] \nonumber \\&\qquad + {\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k \bar{\nu }_k^{N,*}(t),\sum _{k=1}^K \pi _k \mu _k^*(t) \Bigr ) \right) \right] + {\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k \mu _k^*(t),\sum _{k=1}^K \pi _k^N \mu _k^*(t) \Bigr ) \right) \right] . \end{aligned}$$

(D.4)

In view of the assumption in (18) and (D.3), the first and last terms in (D.4) are bounded above by $c \sup _{ k \in \mathcal {K}} | \pi _k^N - \pi _k|^2$. For the second term in (D.4), from Lemma C1 in “Appendix C” section, we have

$$\begin{aligned}&{\mathbb {E}} \left[ W_2^2 \left( \sum _{k=1}^K \pi _k \bar{\nu }_k^{N,*}(t),\sum _{k=1}^K \pi _k \mu _k^*(t) \Bigr ) \right) \right] = O\left( \frac{1}{N^{2/(n+4)}} \right) ,~ \forall t \in [0,T]. \end{aligned}$$

This yields the desired result, thus completing the proof. $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Moon, J., Başar, T. Risk-Sensitive Mean Field Games via the Stochastic Maximum Principle. Dyn Games Appl 9, 1100–1125 (2019). https://doi.org/10.1007/s13235-018-00290-z

Download citation

Published: 13 December 2018
Issue Date: December 2019
DOI: https://doi.org/10.1007/s13235-018-00290-z

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Risk-Sensitive Mean Field Games via the Stochastic Maximum Principle

Abstract

Access this article

Similar content being viewed by others

Partially Observed Discrete-Time Risk-Sensitive Mean Field Games

Risk, Optimization and Meanfield Type Control

Mean-Field-Type Games with Jump and Regime Switching

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix A: The Risk-Sensitive Maximum Principle

Theorem A1

Remark A1

Proof of Theorem A1

Appendix B: Proof of Theorem 1

Appendix C: Proof of Theorem 2

Lemma C1

Proof of Theorem 2

Appendix D: Lemma for Sect. 5

Lemma D1

Proof

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Risk-Sensitive Mean Field Games via the Stochastic Maximum Principle

Abstract

Access this article

Similar content being viewed by others

Partially Observed Discrete-Time Risk-Sensitive Mean Field Games

Risk, Optimization and Meanfield Type Control

Mean-Field-Type Games with Jump and Regime Switching

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Appendices

Appendix A: The Risk-Sensitive Maximum Principle

Theorem A1

Remark A1

Proof of Theorem A1

Appendix B: Proof of Theorem 1

Appendix C: Proof of Theorem 2

Lemma C1

Proof of Theorem 2

Appendix D: Lemma for Sect. 5

Lemma D1

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation