Decentralized optimality conditions of stochastic differential decision problems via Girsanov’s measure transformation

Charalambous, Charalambos D.

doi:10.1007/s00498-016-0168-3


Original Article
Published: 06 July 2016

Volume 28, article number 19, (2016)

Mathematics of Control, Signals, and Systems

Charalambos D. Charalambous¹


Abstract

In this paper, we apply two methods to derive necessary and sufficient decentralized optimality conditions for stochastic differential decision problems with multiple Decision Makers (DMs), which aim at optimizing a common pay-off, based on the notions of decentralized global optimality and decentralized person-by-person (PbP) optimality. Method 1: We utilize the stochastic maximum principle to derive necessary and sufficient conditions which consist of forward and backward Stochastic Differential Equations (SDEs), and conditional variational Hamiltonians, conditioned on the information structures of the DMs. The sufficient conditions for decentralized PbP optimality are local conditions, closely related to the necessary conditions for decentralized PbP optimality. However, under a certain convexity condition on the Hamiltonian, and a global version of the sufficient conditions for decentralized PbP optimality, we show decentralized global optimality. Method 2: We utilize the value processes of decentralized PbP optimal policies, relate them to solutions of backward SDEs, identify sufficient conditions for decentralized PbP optimality, and show that these are precisely those derived via the maximum principle. For both methods, as usual, we utilize Girsanov’s theorem to transform the initial decentralized stochastic optimal decision problems into equivalent decentralized stochastic optimal decision problems on a reference probability space, in which the controlled process and the information processes which generate part of the information structures of the DMs are independent of any of the decisions.



Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Cyprus, 1678, Nicosia, Cyprus
Charalambos D. Charalambous


Corresponding author

Correspondence to Charalambos D. Charalambous.

Appendix

Proof of Theorem 1

Statement (2), ${\mathbb {E}} \Lambda ^u(t)=1, \forall t \in [0,T]$.

First, we show that ${\mathbb {E}}\{\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\}<K$. By applying the Itô differential rule, we obtain

$$\begin{aligned} \mathrm{d} |x(t)|_{{\mathbb {R}}^n}^2= & {} 2\langle x(t), \sigma (t,x(t))\mathrm{d}W(t) \rangle + \hbox {tr} ( a(t,x(t)))\mathrm{d}t, \quad a(t,x) \mathop {=}\limits ^{\triangle }\sigma (t,x) \sigma ^*(t,x) \\ \mathrm{d} \Big (\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\Big )= & {} \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2 f^{*}(t,x(t),u(t)) (a(t,x(t)))^{-1} \mathrm{d}x(t) \\&+\, 2 \Lambda ^u(t) \langle x(t), \sigma (t,x(t))\mathrm{d}W(t) \rangle \\&+\, 2 \Lambda ^u(t) \langle x(t), f(t,x(t),u(t)) \rangle \mathrm{d}t + \Lambda ^u(t) \hbox {tr} (a(t,x(t)))\mathrm{d}t. \end{aligned}$$

Then by applying the Itô differential rule once more we have

$$\begin{aligned}&\mathrm{d} \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} = \frac{1}{\Big ( 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\Big )^2}\nonumber \\&\qquad \times \Big \{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2 f^{*}(t,x(t),u(t))(a(t,x(t)))^{-1} \mathrm{d}x(t)+2 \Lambda ^u(t) \langle x(t), \sigma (t,x(t))\mathrm{d}W(t) \rangle \nonumber \\&\qquad +\, 2 \Lambda ^u(t) \langle x(t), f(t,x(t),u(t)) \rangle \mathrm{d}t+\Lambda ^u(t) \hbox {tr} ( a(t,x(t)) )\mathrm{d}t \Big \} \\&\qquad -\, \frac{\epsilon (\Lambda ^u(t))^2}{( 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2)^3}\Big \{ \langle ( ( a(t,x(t)))^{-1} f(t,x(t),u(t)) |x(t)|_{{\mathbb {R}}^n}^2\\&\qquad +\,2x(t) ), a(t,x(t)) ( ( a(t,x(t)))^{-1} f(t,x(t),u(t)) |x(t)|_{{\mathbb {R}}^n}^2+2x(t) )\rangle \Big \}\mathrm{d}t. \end{aligned}$$
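
For reference, the two prefactors in the display above are the first derivative and half the second derivative of the regularizing map applied to $y = \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2$. The symbol $g_\epsilon$ is introduced here only for illustration and does not appear in the original text:

$$\begin{aligned} g_\epsilon (y) \mathop {=}\limits ^{\triangle } \frac{y}{1+\epsilon y}, \quad g_\epsilon '(y) = \frac{1}{(1+\epsilon y)^2}, \quad \frac{1}{2} g_\epsilon ''(y) = -\frac{\epsilon }{(1+\epsilon y)^3}, \quad y \ge 0. \end{aligned}$$

Moreover, $0 \le g_\epsilon (y) \le \min \{y, 1/\epsilon \}$ and $g_\epsilon (y) \uparrow y$ as $\epsilon \downarrow 0$, so the regularized quantity is bounded for each fixed $\epsilon >0$ and recovers $\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2$ in the limit.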

Integrating over $[0,t]$, taking the expectation with respect to ${\mathbb {P}}$, using the fact that ${\mathbb {E}} ( \Lambda ^u(t)) \le 1, \forall t \in [0,T]$, and differentiating with respect to $t$ yields

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} {\mathbb {E}} \left\{ \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} \le {\mathbb {E}} \left\{ \frac{\Lambda ^u(t) \Big [ 2\langle x(t), f(t,x(t),u(t)) \rangle + \hbox {tr} \Big ( a(t,x(t)) \Big ) \Big ] }{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} .\nonumber \\ \end{aligned}$$

(159)

By Assumptions 1, (A1), (A2), there exists $K>0$ such that

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} {\mathbb {E}} \left\{ \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} \le K \left( {\mathbb {E}} \left\{ \frac{\Lambda ^u(t) \Big [ |x(t)|_{{\mathbb {R}}^n}^2 + |u(t)|_{{\mathbb {R}}^d}^2\Big ] }{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} +1 \right) \end{aligned}$$

Since, for any $u \in {\mathbb {U}}_A^{(K)}[0, T]$, ${\mathbb {E}} \int _{0}^T \Lambda ^u(t) |u(t)|_{{\mathbb {R}}^d}^2 \mathrm{d}t$ is finite, it follows from Gronwall’s inequality that

$$\begin{aligned} {\mathbb {E}}\left\{ \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} \le C, \quad \forall t \in [0,T]. \end{aligned}$$

(160)

Letting $\epsilon \longrightarrow 0$ in (160), by Fatou’s lemma we obtain ${\mathbb {E}} \{\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\} \le C, \forall t \in [0,T]$. Consider

$$\begin{aligned} \mathrm{d} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)}= & {} \frac{\Lambda ^u(t) f^*(t,x(t),u(t))(a(t,x(t)))^{-1} \mathrm{d}x(t)}{( 1 + \epsilon \Lambda ^u(t))^2} \\&-\, \frac{\epsilon (\Lambda ^u(t))^2 f^*(t,x(t),u(t))(a(t,x(t)))^{-1}f(t,x(t),u(t))}{(1 + \epsilon \Lambda ^u(t))^3}\mathrm{d}t; \end{aligned}$$

then

$$\begin{aligned} {\mathbb {E}} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)} = \frac{1}{1+\epsilon }- {\mathbb {E}} \int _{0}^t \frac{\epsilon (\Lambda ^u(s))^2 f^*(s,x(s),u(s))(a(s,x(s)))^{-1} f(s,x(s),u(s))}{(1 + \epsilon \Lambda ^u(s))^3}\mathrm{d}s.\nonumber \\ \end{aligned}$$

(161)

Since

$$\begin{aligned}&\frac{\epsilon (\Lambda ^u(t))^2 f^*(t,x(t),u(t))( a(t,x(t)))^{-1}f(t,x(t),u(t))}{( 1 + \epsilon \Lambda ^u(t))^3} \longrightarrow 0,\\&\quad \hbox {a.e. }t \in [0,T],{\mathbb {P}}\text {-a.s.} \quad \hbox {as }\epsilon \longrightarrow 0 \end{aligned}$$

and by (A7) there exists a constant $C>0$ such that the left-hand side is bounded by $C \Lambda ^u(t) ( 1+|x(t)|_{{\mathbb {R}}^n}^2+ |u(t)|_{{\mathbb {R}}^d}^2)$, by Lebesgue’s dominated convergence theorem we have

$$\begin{aligned} {\mathbb {E}} \int _{0}^t \frac{\epsilon (\Lambda ^u(s))^2 f^*(s,x(s),u(s))(a(s,x(s)))^{-1}f(s,x(s),u(s))}{(1 + \epsilon \Lambda ^u(s))^3}\mathrm{d}s \longrightarrow 0 \quad \hbox {as }\epsilon \longrightarrow 0.\nonumber \\ \end{aligned}$$

(162)
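
The domination used to reach (162) rests on an elementary bound, sketched here for completeness. The shorthand $q$ is introduced only for this remark, and $q \ge 0$ because $a = \sigma \sigma ^*$ is assumed invertible, hence positive definite. Since $\Lambda ^u(s) \ge 0$ and $\epsilon \Lambda ^u(s)/(1+\epsilon \Lambda ^u(s)) \le 1$,

$$\begin{aligned} \frac{\epsilon (\Lambda ^u(s))^2 q(s)}{(1+\epsilon \Lambda ^u(s))^3} = \Lambda ^u(s) q(s) \cdot \frac{\epsilon \Lambda ^u(s)}{1+\epsilon \Lambda ^u(s)} \cdot \frac{1}{(1+\epsilon \Lambda ^u(s))^2} \le \Lambda ^u(s) q(s), \quad q(s) \mathop {=}\limits ^{\triangle } f^*(s,x(s),u(s))(a(s,x(s)))^{-1} f(s,x(s),u(s)). \end{aligned}$$

The growth condition on $q$ from (A7) then gives the dominating function $C \Lambda ^u(s) ( 1+|x(s)|_{{\mathbb {R}}^n}^2+ |u(s)|_{{\mathbb {R}}^d}^2)$, which is integrable on $[0,T] \times \Omega $ because ${\mathbb {E}} \Lambda ^u(s) \le 1$, ${\mathbb {E}} \{\Lambda ^u(s) |x(s)|_{{\mathbb {R}}^n}^2\}$ is bounded uniformly in $s$, and ${\mathbb {E}} \int _{0}^T \Lambda ^u(s) |u(s)|_{{\mathbb {R}}^d}^2 \mathrm{d}s$ is finite.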

Since ${\mathbb {E}} (\Lambda ^u(t))\le 1, \forall t \in [0,T]$, the left-hand side of (161) satisfies ${\mathbb {E}} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)} \longrightarrow {\mathbb {E}} \Lambda ^u(t)$ as $\epsilon \longrightarrow 0$, while by (162) the right-hand side of (161) converges to 1. Hence, we must have ${\mathbb {E}} \Lambda ^u(t)=1, \forall t \in [0,T]$. This completes the derivation.
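
As a numerical sanity check of statement (2), the following is a minimal Monte Carlo sketch under assumptions that are not taken from the paper: a scalar model with $\sigma = 1$, $x(0)=0$, and the control-free drift $f(t,x,u) = -x$. The function name `simulate_density` and all parameter values are hypothetical. Under the reference measure, $x$ is simulated as a Brownian motion and $\log \Lambda$ is accumulated along each path. The sample mean of $\Lambda (T)$ should be close to 1, and the weighted moment ${\mathbb {E}}\{\Lambda (T) x(T)^2\}$ should be close to the second moment of the Ornstein-Uhlenbeck process $\mathrm{d}x = -x\,\mathrm{d}t + \mathrm{d}W$ at time $T$, up to discretization and sampling error.

```python
import math
import random


def simulate_density(n_paths=10_000, n_steps=100, horizon=1.0, seed=0):
    """Estimate E[Lambda(T)] and E[Lambda(T) x(T)^2] under the reference measure.

    Illustrative scalar model (an assumption, not from the paper):
    sigma = 1, x(0) = 0, drift f(t, x, u) = -x, hence a = sigma^2 = 1.
    Under the reference measure, dx = sigma dW, so x is a Brownian motion.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    sum_density = 0.0
    sum_weighted_sq = 0.0
    for _path in range(n_paths):
        x = 0.0
        log_density = 0.0
        for _step in range(n_steps):
            dx = sqrt_dt * rng.gauss(0.0, 1.0)  # increment of x under the reference measure
            f = -x  # drift evaluated at the left endpoint (Ito convention)
            # d log(Lambda) = f a^{-1} dx - 0.5 f a^{-1} f dt, with a = 1
            log_density += f * dx - 0.5 * f * f * dt
            x += dx
        density = math.exp(log_density)
        sum_density += density
        sum_weighted_sq += density * x * x
    return sum_density / n_paths, sum_weighted_sq / n_paths


mean_density, weighted_second_moment = simulate_density()
# Second moment at T = 1 of dx = -x dt + dW started at 0.
ou_second_moment = (1.0 - math.exp(-2.0)) / 2.0
print("E[Lambda(T)] estimate:", mean_density)
print("E[Lambda(T) x(T)^2] estimate:", weighted_second_moment)
print("Ornstein-Uhlenbeck second moment:", ou_second_moment)
```

The drift is evaluated before each increment so that the discrete-time density is itself a martingale with mean one, mirroring the continuous-time statement. A mean-reverting drift is chosen because it keeps $\Lambda $ bounded and therefore the Monte Carlo variance small.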


About this article

Cite this article

Charalambous, C.D. Decentralized optimality conditions of stochastic differential decision problems via Girsanov’s measure transformation. Math. Control Signals Syst. 28, 19 (2016). https://doi.org/10.1007/s00498-016-0168-3


Received: 03 October 2013
Accepted: 21 June 2016
Published: 06 July 2016
DOI: https://doi.org/10.1007/s00498-016-0168-3
