Abstract
We consider risk measurement in controlled partially observable Markov processes in discrete time. We introduce a new concept of conditional stochastic time consistency and derive the structure of risk measures enjoying this property. We prove that they can be represented by a collection of static law-invariant risk measures on the space of functions of the observable part of the state. We also derive the corresponding dynamic programming equations. Finally, we illustrate the results on a machine deterioration problem.
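To make the general form of the risk-averse dynamic programming equations concrete, the sketch below runs a backward recursion of the type V_t(x) = min_u [ c(x,u) + rho_{x,u}(V_{t+1}) ], with a static law-invariant mean-upper-semideviation risk measure rho in place of the expectation, on a small machine-deterioration model. This is an illustrative assumption-based toy: it treats the state as fully observable and uses hypothetical states, costs, transition kernels, and risk parameter, so it is not the paper's partially observable model or its results, only a sketch of the recursion's structure.

```python
import numpy as np

# Toy, fully observable machine-deterioration model (illustrative assumption,
# not the paper's partially observable setup): states are deterioration
# levels, actions are "operate" (0) or "replace" (1).
STATES = [0, 1, 2]            # 0 = new, 1 = worn, 2 = failed
ACTIONS = [0, 1]

# One-step costs c(x, u): operating a worn or failed machine is expensive,
# replacement has a fixed cost (hypothetical numbers).
COST = {
    (0, 0): 1.0, (1, 0): 3.0, (2, 0): 10.0,
    (0, 1): 5.0, (1, 1): 5.0, (2, 1): 5.0,
}

# Transition kernels P(x' | x, u): operating lets the machine deteriorate,
# replacing resets it to the "new" state.
P = {
    0: np.array([[0.8, 0.2, 0.0],
                 [0.0, 0.7, 0.3],
                 [0.0, 0.0, 1.0]]),
    1: np.array([[1.0, 0.0, 0.0]] * 3),
}

def mean_semideviation(values, probs, kappa=0.5):
    """Static law-invariant risk measure rho(Z) = E[Z] + kappa * E[(Z - E[Z])_+]."""
    mean = probs @ values
    return mean + kappa * probs @ np.maximum(values - mean, 0.0)

def risk_averse_dp(horizon=10):
    """Backward induction V_t(x) = min_u [ c(x,u) + rho_{x,u}(V_{t+1}) ]."""
    V = np.zeros(len(STATES))           # terminal value V_T = 0
    policy = []
    for _ in range(horizon):
        Q = np.array([[COST[x, u] + mean_semideviation(V, P[u][x])
                       for u in ACTIONS] for x in STATES])
        policy.append(Q.argmin(axis=1))  # risk-averse optimal action per state
        V = Q.min(axis=1)
    return V, policy[::-1]               # values at t = 0, policies for t = 0..T-1

if __name__ == "__main__":
    V0, policy0 = risk_averse_dp()
    print("Risk-averse values at t=0:", np.round(V0, 3))
    print("Policy at t=0 (0=operate, 1=replace):", policy0[0])
```

Setting kappa = 0 recovers the risk-neutral expected-cost recursion, which shows how the static law-invariant risk measure enters the dynamic programming equations one stage at a time.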

Acknowledgements
Funding was provided by the National Science Foundation, Division of Mathematical Sciences (Grant No. 1312016).
Cite this article
Fan, J., Ruszczyński, A. Risk measurement and risk-averse control of partially observable discrete-time Markov systems. Math Meth Oper Res 88, 161–184 (2018). https://doi.org/10.1007/s00186-018-0633-5