Decentralized optimality conditions of stochastic differential decision problems via Girsanov’s measure transformation

  • Original Article
  • Mathematics of Control, Signals, and Systems

Abstract

In this paper, we apply two methods to derive necessary and sufficient decentralized optimality conditions for stochastic differential decision problems with multiple Decision Makers (DMs), which aim at optimizing a common pay-off, based on the notions of decentralized global optimality and decentralized person-by-person (PbP) optimality. Method 1: We utilize the stochastic maximum principle to derive necessary and sufficient conditions which consist of forward and backward Stochastic Differential Equations (SDEs), and conditional variational Hamiltonians, conditioned on the information structures of the DMs. The sufficient conditions for decentralized PbP optimality are local conditions, closely related to the necessary conditions for decentralized PbP optimality. However, under a certain convexity condition on the Hamiltonian and a global version of the sufficient conditions for decentralized PbP optimality, we show decentralized global optimality. Method 2: We utilize the value processes of decentralized PbP optimal policies, relate them to solutions of backward SDEs, identify sufficient conditions for decentralized PbP optimality, and show that these are precisely those derived via the maximum principle. For both methods, as usual, we utilize Girsanov’s theorem to transform the initial decentralized stochastic optimal decision problems to equivalent decentralized stochastic optimal decision problems on a reference probability space, in which the controlled process and the information processes, which generate part of the information structures of the DMs, are independent of any of the decisions.

Author information

Correspondence to Charalambos D. Charalambous.

Appendix

Proof of Theorem 1

Statement (2), \({\mathbb {E}} \Lambda ^u(t)=1, \forall t \in [0,T]\).

First, we show that \({\mathbb {E}}\{\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\}<K\). By applying the Itô differential rule, we obtain

$$\begin{aligned} \mathrm{d} |x(t)|_{{\mathbb {R}}^n}^2= & {} 2\langle x(t), \sigma (t,x(t))\mathrm{d}W(t) \rangle + \hbox {tr} ( a(t,x(t)))\mathrm{d}t, \quad a(t,x) \mathop {=}\limits ^{\triangle }\sigma (t,x) \sigma ^*(t,x) \\ \mathrm{d} \Big (\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\Big )= & {} \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2 f^{*}(t,x(t),u(t)) (a(t,x(t)))^{-1} \mathrm{d}x(t) \\&+\, 2 \Lambda ^u(t) \langle x(t), \sigma (t,x(t))\mathrm{d}W(t) \rangle \\&+\, 2 \Lambda ^u(t) \langle x(t), f(t,x(t),u(t)) \rangle \mathrm{d}t + \Lambda ^u(t) \hbox {tr} (a(t,x(t)))\mathrm{d}t. \end{aligned}$$
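The second identity is the Itô product rule applied to \(\Lambda ^u(t)\) and \(|x(t)|_{{\mathbb {R}}^n}^2\). As a sketch, assume (as the first identity suggests, although the defining equations are not restated in this appendix) that on the reference probability space \(\mathrm{d}x(t)=\sigma (t,x(t))\mathrm{d}W(t)\) and \(\mathrm{d}\Lambda ^u(t)=\Lambda ^u(t) f^{*}(t,x(t),u(t)) (a(t,x(t)))^{-1} \mathrm{d}x(t)\). Under these assumptions, and suppressing arguments, the cross-variation term that produces the \(\langle x, f \rangle \) contribution works out as:

$$\begin{aligned} \mathrm{d} \Big (\Lambda ^u |x|_{{\mathbb {R}}^n}^2\Big )= & {} |x|_{{\mathbb {R}}^n}^2 \, \mathrm{d}\Lambda ^u + \Lambda ^u \, \mathrm{d}|x|_{{\mathbb {R}}^n}^2 + \mathrm{d}\big \langle \Lambda ^u, |x|_{{\mathbb {R}}^n}^2 \big \rangle _t, \\ \mathrm{d}\big \langle \Lambda ^u, |x|_{{\mathbb {R}}^n}^2 \big \rangle _t= & {} 2 \Lambda ^u f^{*} a^{-1} \sigma \sigma ^{*} x \, \mathrm{d}t = 2 \Lambda ^u \langle x, f \rangle \mathrm{d}t. \end{aligned}$$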

Then by applying the Itô differential rule once more we have

$$\begin{aligned}&\mathrm{d} \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} = \frac{1}{\Big ( 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\Big )^2}\nonumber \\&\qquad \times \Big \{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2 f^{*}(t,x(t),u(t))(a(t,x(t)))^{-1} \mathrm{d}x(t)+2 \Lambda ^u(t) \langle x(t), \sigma (t,x(t))\mathrm{d}W(t) \rangle \nonumber \\&\qquad +\, 2 \Lambda ^u(t) \langle x(t), f(t,x(t),u(t)) \rangle \mathrm{d}t+\Lambda ^u(t) \hbox {tr} ( a(t,x(t)) )\mathrm{d}t \Big \} \\&\qquad -\, \frac{\epsilon (\Lambda ^u(t))^2}{( 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2)^3}\Big \{ \Big \langle (a(t,x(t)))^{-1} f(t,x(t),u(t)) |x(t)|_{{\mathbb {R}}^n}^2 +2x(t), \\&\qquad a(t,x(t)) \Big ( (a(t,x(t)))^{-1} f(t,x(t),u(t)) |x(t)|_{{\mathbb {R}}^n}^2+2x(t) \Big ) \Big \rangle \Big \}\mathrm{d}t. \end{aligned}$$

Integrating over [0, t], taking the expectation with respect to \({\mathbb {P}}\), discarding the last term, which is nonpositive because \(a(t,x)\) is positive semidefinite, and using the fact that \({\mathbb {E}} ( \Lambda ^u(t)) \le 1, \forall t \in [0,T]\), yields

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} {\mathbb {E}} \left\{ \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} \le {\mathbb {E}} \left\{ \frac{\Lambda ^u(t) \Big [ 2\langle x(t), f(t,x(t),u(t)) \rangle + \hbox {tr} \Big ( a(t,x(t)) \Big ) \Big ] }{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} . \end{aligned} \tag{159}$$

By Assumptions 1, (A1), (A2), there exists \(K>0\) such that

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} {\mathbb {E}} \left\{ \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} \le K \left( {\mathbb {E}} \left\{ \frac{\Lambda ^u(t) \Big [ |x(t)|_{{\mathbb {R}}^n}^2 + |u(t)|_{{\mathbb {R}}^d}^2\Big ] }{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} +1 \right) \end{aligned}$$

Since, for any \(u \in {\mathbb {U}}_A^{(K)}[0, T]\), the quantity \( {\mathbb {E}} \int _{0}^T \Lambda ^u(t) |u(t)|_{{\mathbb {R}}^d}^2 \mathrm{d}t\) is finite, it follows from Gronwall's inequality that

$$\begin{aligned} {\mathbb {E}}\left\{ \frac{ \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2}{ 1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2} \right\} \le C, \quad \forall t \in [0,T]. \end{aligned} \tag{160}$$
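To make the Gronwall step explicit, the following is a sketch under two assumptions that are not restated in this appendix: \(\Lambda ^u(0)=1\) and \({\mathbb {E}} |x(0)|_{{\mathbb {R}}^n}^2 < \infty \). The shorthand \(\varphi _\epsilon \) and \(g\) is introduced here only for illustration. Write \(\varphi _\epsilon (t)\) for the expectation on the left-hand side of the bound above and \(g(t) = {\mathbb {E}} \{ \Lambda ^u(t) |u(t)|_{{\mathbb {R}}^d}^2 \}\). Since the denominator \(1+ \epsilon \Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\) is at least 1, the preceding differential inequality gives \(\frac{\mathrm{d}}{\mathrm{d}t} \varphi _\epsilon (t) \le K ( \varphi _\epsilon (t) + g(t) + 1 )\), and therefore

$$\begin{aligned} \varphi _\epsilon (t)\le & {} e^{Kt} \varphi _\epsilon (0) + K \int _{0}^t e^{K(t-s)} \big ( g(s)+1 \big ) \mathrm{d}s \\\le & {} e^{KT} \Big ( {\mathbb {E}} |x(0)|_{{\mathbb {R}}^n}^2 + K \int _{0}^T g(s) \mathrm{d}s + KT \Big ), \quad \forall t \in [0,T]. \end{aligned}$$

The right-hand side does not depend on \(\epsilon \), which is what allows the limit \(\epsilon \longrightarrow 0\) to be taken afterwards.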

Letting \(\epsilon \longrightarrow 0\) and applying Fatou’s lemma, we obtain \({\mathbb {E}} \{\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\}\le C, \forall t \in [0,T]\). Next, consider

$$\begin{aligned} \mathrm{d} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)}= & {} \frac{\Lambda ^u(t) f^*(t,x(t),u(t))(a(t,x(t)))^{-1} \mathrm{d}x(t)}{( 1 + \epsilon \Lambda ^u(t))^2} \\&-\, \frac{\epsilon (\Lambda ^u(t))^2 f^*(t,x(t),u(t))(a(t,x(t)))^{-1}f(t,x(t),u(t))}{(1 + \epsilon \Lambda ^u(t))^3} \mathrm{d}t; \end{aligned}$$

then

$$\begin{aligned} {\mathbb {E}} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)} = 1- {\mathbb {E}} \int _{0}^t \frac{\epsilon (\Lambda ^u(s))^2 f^*(s,x(s),u(s))(a(s,x(s)))^{-1} f(s,x(s),u(s))}{(1 + \epsilon \Lambda ^u(s))^3}\mathrm{d}s. \end{aligned} \tag{161}$$

Since

$$\begin{aligned}&\frac{\epsilon (\Lambda ^u(t))^2 f^*(t,x(t),u(t))( a(t,x(t)))^{-1}f(t,x(t),u(t))}{( 1 + \epsilon \Lambda ^u(t))^3} \longrightarrow 0,\\&\quad \hbox {a.e. }t \in [0,T],{\mathbb {P}}\text {-a.s.} \quad \hbox {as }\epsilon \longrightarrow 0 \end{aligned}$$

and, by (A7), it is bounded by \(C \Lambda ^u(t) ( 1+|x(t)|_{{\mathbb {R}}^n}^2+ |u(t)|_{{\mathbb {R}}^d}^2)\) for some constant \(C>0\), which is integrable by the first part of the proof, Lebesgue’s dominated convergence theorem yields

$$\begin{aligned} {\mathbb {E}} \int _{0}^t \frac{\epsilon (\Lambda ^u(s))^2 f^*(s,x(s),u(s))(a(s,x(s)))^{-1}f(s,x(s),u(s))}{(1 + \epsilon \Lambda ^u(s))^3} \mathrm{d}s \longrightarrow 0 \quad \hbox {as }\epsilon \longrightarrow 0. \end{aligned} \tag{162}$$

Since \({\mathbb {E}} (\Lambda ^u(t))\le 1, \forall t \in [0,T]\), dominated convergence gives \({\mathbb {E}} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)} \longrightarrow {\mathbb {E}} \Lambda ^u(t)\), as \(\epsilon \longrightarrow 0\), while substituting (162) into (161) shows that the same quantity converges to 1. Hence, we must have \({\mathbb {E}} \Lambda ^u(t)=1, \forall t \in [0,T]\). This completes the derivation.
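As a numerical sanity check on the conclusion, the sketch below simulates a scalar toy model on the reference space and estimates both \({\mathbb {E}} \Lambda ^u(T)\) and the regularized quantity \({\mathbb {E}} \frac{\Lambda ^u(T)}{1+\epsilon \Lambda ^u(T)}\) for a few values of \(\epsilon \). Everything specific here is an assumption made for illustration and is not taken from the paper: the drift `f = -x + u`, the constant control `u`, the constant `sigma`, the initial values `x(0) = 0` and `Lambda(0) = 1`, and the Euler discretization. It is not the proof, only a Monte Carlo illustration of the statement.

```python
import math
import random


def simulate_lambda(T=1.0, steps=50, sigma=1.0, u=0.5, rng=random):
    """Return one Euler sample of Lambda^u(T) for a scalar toy model."""
    dt = T / steps
    a = sigma * sigma   # a = sigma * sigma^* in the scalar case
    x = 0.0             # assumed initial state x(0) = 0
    log_lam = 0.0       # assumed Lambda^u(0) = 1, so its logarithm starts at 0
    for _ in range(steps):
        f = -x + u      # hypothetical drift with linear growth
        # Under the reference measure the state has no drift: dx = sigma dW.
        dx = sigma * rng.gauss(0.0, math.sqrt(dt))
        # d(log Lambda) = f a^{-1} dx - (1/2) f a^{-1} f dt
        log_lam += (f / a) * dx - 0.5 * (f / a) * f * dt
        x += dx
    return math.exp(log_lam)


def estimate_means(n_paths=5000, eps_values=(0.1, 0.01, 0.001), seed=0):
    """Estimate E[Lambda] and E[Lambda / (1 + eps * Lambda)] on the same paths."""
    rng = random.Random(seed)
    samples = [simulate_lambda(rng=rng) for _ in range(n_paths)]
    plain = sum(samples) / n_paths
    regularized = {
        eps: sum(lam / (1.0 + eps * lam) for lam in samples) / n_paths
        for eps in eps_values
    }
    return plain, regularized


if __name__ == "__main__":
    plain, regularized = estimate_means()
    print("estimate of E[Lambda(T)]:", plain)
    for eps, value in regularized.items():
        print("eps =", eps, "-> estimate of E[Lambda/(1+eps*Lambda)]:", value)
```

In this toy model the sample mean of `Lambda` should sit close to 1 up to Monte Carlo error, and the regularized means should increase toward it as `eps` decreases, mirroring the limiting argument in the proof.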

About this article

Cite this article

Charalambous, C.D. Decentralized optimality conditions of stochastic differential decision problems via Girsanov’s measure transformation. Math. Control Signals Syst. 28, 19 (2016). https://doi.org/10.1007/s00498-016-0168-3
