Abstract
The usual optimality criteria for Markov decision processes (e.g. total discounted reward or mean reward) may be insufficient to fully capture the preferences of a decision maker. It may be preferable to select more sophisticated criteria that also reflect the variability-risk features of the problem. To this end we focus attention on risk-sensitive optimality criteria (i.e. the case when the stream of rewards generated by the Markov process is evaluated through an exponential utility function) and their connections with mean-variance optimality (i.e. the case when a suitable combination of the expected total reward and its variance, usually considered per transition, is selected as the optimality criterion). Research on risk-sensitive optimality criteria in Markov decision processes was initiated in the seminal paper by Howard and Matheson [6] and followed by many other researchers (see e.g. [1, 2, 3, 4, 5, 8, 9, 14]).

In this note we consider a Markov decision chain \( X = \{X_n,\ n = 0, 1, \ldots\} \) with finite state space \( \mathcal{I} = \{1, 2, \ldots, N\} \) and a finite set \( \mathcal{A}_i = \{1, 2, \ldots, K_i\} \) of possible decisions (actions) in state \( i \in \mathcal{I} \). If action \( k \in \mathcal{A}_i \) is selected in state \( i \in \mathcal{I} \), then state \( j \) is reached in the next transition with a given probability \( p_{ij}^{k} \), and a one-stage transition reward \( r_{ij} \) is accrued to this transition.
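The exponential-utility evaluation described above leads to a multiplicative dynamic-programming recursion \( U_{n+1}(i) = \sum_j p_{ij}\, e^{\gamma r_{ij}}\, U_n(j) \), so that \( U_n(i) = \mathbb{E}_i\!\left[e^{\gamma \sum_{t} r_{X_t X_{t+1}}}\right] \) and the long-run certainty equivalent per transition is governed by the spectral radius of the kernel \( p_{ij} e^{\gamma r_{ij}} \). The following sketch illustrates this for a hypothetical two-state chain with a single action per state; the transition probabilities, rewards, and risk parameter are illustrative numbers, not taken from the paper.

```python
import numpy as np

# Hypothetical 2-state Markov chain with one action per state.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])   # transition probabilities p_ij
R = np.array([[1.0, 2.0],
              [0.0, 3.0]])   # one-stage rewards r_ij
gamma = 0.1                  # risk-sensitivity parameter (gamma > 0: risk-seeking)

# "Risk-sensitive" kernel: entrywise p_ij * exp(gamma * r_ij).
Q = P * np.exp(gamma * R)

# Multiplicative DP recursion: U_{n+1}(i) = sum_j Q_ij U_n(j),
# so U_n(i) = E_i[ exp(gamma * total reward over n transitions) ].
n = 20
U = np.ones(2)
for _ in range(n):
    U = Q @ U

# Certainty equivalent per transition, (1/(gamma*n)) * log U_n(i),
# converges to log(spectral radius of Q) / gamma as n grows.
cert_equiv = np.log(U) / (gamma * n)
rho = max(abs(np.linalg.eigvals(Q)))
print(cert_equiv, np.log(rho) / gamma)
```

Note that the recursion is linear but not stochastic: the row sums of \( Q \) exceed one for \( \gamma > 0 \), which is why the growth rate (spectral radius), rather than a fixed point, carries the optimality information.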
References
Bielecki TR, Hernández-Hernández D, Pliska SR (1999) Risk-sensitive control of finite state Markov chains in discrete time, with application to portfolio management. Math Methods Oper Res 50:167–188
Cavazos-Cadena R, Montes-de-Oca R (2003) The value iteration algorithm in risk-sensitive average Markov decision chains with finite state space. Math Oper Res 28:752–756
Cavazos-Cadena R (2003) Solution to the risk-sensitive average cost optimality equation in a class of Markov decision processes with finite state space. Math Methods Oper Res 57:253–285
Jaquette SA (1976) A utility criterion for Markov decision processes. Manag Sci 23:43–49
Hinderer K, Waldmann KH (2003) The critical discount factor for finite Markovian decision processes with an absorbing set. Math Methods Oper Res 57:1–19
Howard RA, Matheson JE (1972) Risk-sensitive Markov decision processes. Manag Sci 18:356–369
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York
Rothblum UG, Whittle P (1982) Growth optimality for branching Markov decision chains. Math Oper Res 7:582–601
Sladký K (1976) On dynamic programming recursions for multiplicative Markov decision chains. Math Programming Study 6:216–226
Sladký K (1980) Bounds on discrete dynamic programming recursions I. Kybernetika 16:526–547
Sladký K (1981) On the existence of stationary optimal policies in discrete dynamic programming. Kybernetika 17:489–513
Sladký K (2005) On mean reward variance in semi-Markov processes. Math Methods Oper Res 62:387–397
Whittle P (1983) Optimization over time: dynamic programming and stochastic control, vol II, chap 35. Wiley, Chichester
Zijm WHM (1983) Nonnegative matrices in dynamic programming. Mathematical Centre Tract, Amsterdam
© 2007 Springer-Verlag Berlin Heidelberg
Sladký, K. (2007). Risk-Sensitive Optimality Criteria in Markov Decision Processes. In: Waldmann, KH., Stocker, U.M. (eds) Operations Research Proceedings 2006. Operations Research Proceedings, vol 2006. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69995-8_88
Print ISBN: 978-3-540-69994-1
Online ISBN: 978-3-540-69995-8