Risk-Sensitive Average Optimality in Markov Decision Chains

Sladký, Karel; Montes-de-Oca, Raúl

doi:10.1007/978-3-540-77903-2_11

Part of the book series: Operations Research Proceedings (ORP, volume 2007)

Abstract

We consider a Markov decision chain \( X = \{X_n,\ n = 0, 1, \ldots\} \) with finite state space \( \mathcal{I} = \{1, 2, \ldots, N\} \) and a finite set \( \mathcal{A}_i = \{1, 2, \ldots, K_i\} \) of possible decisions (actions) in state \( i \in \mathcal{I} \). If action \( k \in \mathcal{A}_i \) is selected in state \( i \in \mathcal{I} \), then state \( j \) is reached in the next transition with a given probability \( p^k_{ij} \), and a one-stage transition reward \( r_{ij} \) is accrued on that transition.
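
The model described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the chapter: the two-state example, the numerical values, the 0-based indexing, and the helper names `P`, `R`, `step`, and `simulate` are all hypothetical. The sketch only samples transitions and accumulates the one-stage rewards under a fixed stationary policy; it does not implement any optimality criterion.

```python
import random

# Hypothetical two-state example (N = 2); all numbers are illustrative only.
# States and actions are 0-indexed here, whereas the abstract uses 1..N and 1..K_i.
# P[i][k][j] plays the role of p^k_{ij}: probability of moving i -> j under action k.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # state 0 has two actions (K_0 = 2)
    [[0.5, 0.5]],              # state 1 has a single action (K_1 = 1)
]
# R[i][j] plays the role of r_{ij}: reward accrued on a transition i -> j.
R = [
    [1.0, 0.0],
    [2.0, -1.0],
]


def step(i, k, rng):
    """Sample the next state j from the row p^k_{i.} and return (j, r_ij)."""
    j = rng.choices(range(len(P[i][k])), weights=P[i][k])[0]
    return j, R[i][j]


def simulate(policy, x0, n_steps, seed=0):
    """Follow a stationary policy (a list mapping state -> action) for n_steps transitions."""
    rng = random.Random(seed)
    path, total = [x0], 0.0
    for _ in range(n_steps):
        i = path[-1]
        j, r = step(i, policy[i], rng)
        path.append(j)
        total += r
    return path, total


# Usage: action 1 in state 0, action 0 in state 1, starting from state 0.
path, total = simulate(policy=[1, 0], x0=0, n_steps=10)
print(path, total)
```

Storing the transition law as a nested list indexed by state, then action, then next state allows the number of actions \( K_i \) to differ between states, as in the abstract's definition of \( \mathcal{A}_i \).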

Author information

Authors and Affiliations

Institute of Information Theory and Automation, Pod Vodárenskou věží 4, 18208, Praha 8, Czech Republic
Karel Sladký
Departamento de Matemáticas, Universidad Autónoma Metropolitana, Campus Iztapalapa, Avenida San Rafael Atlixco #186, Colonia Vicentina, México, 09340, D.F., Mexico
Raúl Montes-de-Oca

Editor information

Editors and Affiliations

Faculty of Law and Economics, Chair of Operations Research and Logistics, Saarland University, P.O. Box 15 11 50, 66041, Saarbrücken, Germany
Jörg Kalcsics & Stefan Nickel

About this paper

Cite this paper

Sladký, K., Montes-de-Oca, R. (2008). Risk-Sensitive Average Optimality in Markov Decision Chains. In: Kalcsics, J., Nickel, S. (eds) Operations Research Proceedings 2007. Operations Research Proceedings, vol 2007. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77903-2_11

DOI: https://doi.org/10.1007/978-3-540-77903-2_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77902-5
Online ISBN: 978-3-540-77903-2
eBook Packages: Business and Economics; Business and Management (R0)
