Abstract
In this paper, we apply two methods to derive necessary and sufficient decentralized optimality conditions for stochastic differential decision problems with multiple Decision Makers (DMs) that aim to optimize a common pay-off, based on the notions of decentralized global optimality and decentralized person-by-person (PbP) optimality. Method 1: We use the stochastic maximum principle to derive necessary and sufficient conditions, which consist of forward and backward Stochastic Differential Equations (SDEs) and conditional variational Hamiltonians, conditioned on the information structures of the DMs. The sufficient conditions for decentralized PbP optimality are local conditions, closely related to the necessary conditions for decentralized PbP optimality. However, under a certain convexity condition on the Hamiltonian and a global version of the sufficient conditions for decentralized PbP optimality, we show decentralized global optimality. Method 2: We use the value processes of decentralized PbP optimal policies, relate them to solutions of backward SDEs, identify sufficient conditions for decentralized PbP optimality, and show that these are precisely the conditions derived via the maximum principle. For both methods, as usual, we use Girsanov’s theorem to transform the initial decentralized stochastic optimal decision problems into equivalent decentralized stochastic optimal decision problems on a reference probability space, in which the controlled process and the information processes that generate part of the information structures of the DMs are independent of all decisions.
Appendix
Proof of Theorem 1
Statement (2), \({\mathbb {E}} \Lambda ^u(t)=1, \forall t \in [0,T]\).
First, we show that \({\mathbb {E}}\{\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\}<K\). By applying the Itô differential rule
Then by applying the Itô differential rule once more we have
Integrating over [0, T] and taking the expectation with respect to \({\mathbb {P}}\), and using the fact that \({\mathbb {E}} ( \Lambda ^u(t)) \le 1, \forall t \in [0,T]\), yields
By Assumptions 1, (A1), (A2), there exists \(K>0\) such that
Since \( {\mathbb {E}} \int _{0}^T \Lambda ^u(t) |u(t)|_{{\mathbb {R}}^d}^2 \mathrm{d}t\) is finite for any \(u \in {\mathbb {U}}_A^{(K)}[0, T]\), it follows from Gronwall’s inequality that
By Fatou’s lemma we obtain \({\mathbb {E}} \{\Lambda ^u(t) |x(t)|_{{\mathbb {R}}^n}^2\}< C, \forall t \in [0,T]\). Consider
then
Since
and, by (A7), there exists a constant \(C>0\) such that it is bounded by \(C \Lambda ^u(t) ( 1+|x(t)|_{{\mathbb {R}}^n}^2+ |u(t)|_{{\mathbb {R}}^d}^2)\), by Lebesgue’s dominated convergence theorem we have
Since \({\mathbb {E}} (\Lambda ^u(t))\le 1, \forall t \in [0,T]\), substituting (161) into (162) we obtain \({\mathbb {E}} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)} \longrightarrow {\mathbb {E}} \Lambda ^u(t)\) as \(\epsilon \longrightarrow 0\). Hence, we must have \({\mathbb {E}} \Lambda ^u(t)=1, \forall t \in [0,T]\). This completes the derivation.
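The two facts used in this last step can be checked numerically on a toy case. The following is a minimal sketch and not the setting of Theorem 1: it assumes a scalar exponential martingale \(\Lambda (t)=\exp (\theta W(t)-\tfrac{1}{2}\theta ^2 t)\) with a constant integrand \(\theta \) (a hypothetical stand-in for the state- and decision-dependent integrand of \(\Lambda ^u\)), and the names `sample_density` and `regularized_mean` are illustrative only. It estimates \({\mathbb {E}} \Lambda (t)\) by Monte Carlo and evaluates the regularized means \({\mathbb {E}} \frac{\Lambda (t)}{1+\epsilon \Lambda (t)}\) for decreasing \(\epsilon \).

```python
import math
import random


def sample_density(theta, t, rng):
    """Draw one sample of Lambda(t) = exp(theta*W(t) - 0.5*theta^2*t).

    Assumption: theta is a constant, so W(t) ~ N(0, t) can be sampled
    directly without discretizing a stochastic integral.
    """
    w_t = rng.gauss(0.0, math.sqrt(t))
    return math.exp(theta * w_t - 0.5 * theta ** 2 * t)


def regularized_mean(samples, eps):
    """Sample mean of Lambda / (1 + eps * Lambda)."""
    return sum(lam / (1.0 + eps * lam) for lam in samples) / len(samples)


# Fixed seed so the run is reproducible.
rng = random.Random(0)
samples = [sample_density(0.5, 1.0, rng) for _ in range(200_000)]

# Monte Carlo estimate of E[Lambda(t)]; the exact value is 1 in this toy case.
mean_lambda = sum(samples) / len(samples)

# Regularized means for eps decreasing to 0; eps = 0 recovers mean_lambda.
reg_means = [regularized_mean(samples, eps) for eps in (1.0, 0.1, 0.01, 0.0)]
```

Each sample of \(\Lambda /(1+\epsilon \Lambda )\) increases as \(\epsilon \) decreases, so the regularized means increase towards the plain sample mean. This mirrors the convergence \({\mathbb {E}} \frac{\Lambda ^u(t)}{ 1 + \epsilon \Lambda ^u(t)} \longrightarrow {\mathbb {E}} \Lambda ^u(t)\) used above. The simulation illustrates the statement only and does not replace the estimates in the proof.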
Cite this article
Charalambous, C.D. Decentralized optimality conditions of stochastic differential decision problems via Girsanov’s measure transformation. Math. Control Signals Syst. 28, 19 (2016). https://doi.org/10.1007/s00498-016-0168-3
DOI: https://doi.org/10.1007/s00498-016-0168-3