Abstract
This paper studies risk-sensitive piecewise deterministic Markov decision processes in which the expected exponential utility of an infinite-horizon discounted cost is minimized. Both the transition rate and the cost rate are allowed to be unbounded. Based on a dynamic programming observation, we introduce an auxiliary function that takes time as an additional variable, in contrast to previous works, which use the risk-sensitivity parameter as the additional variable. Under suitable assumptions, we derive the associated Feynman-Kac formula and then establish the associated Hamilton–Jacobi–Bellman equation with time as a differential variable. This leads to the existence of time-dependent optimal policies and shows explicitly that risk-sensitive discounted optimal policies are not stationary.
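The dynamic programming observation mentioned above can be illustrated with a short calculation. The notation here is an assumption made for illustration and is not taken from the paper: $\lambda > 0$ is a risk-sensitivity parameter, $\alpha > 0$ a discount factor, $c$ a cost rate, and $(\xi_s, a_s)$ the state and action at time $s$ under a policy $\pi$. Under these assumptions, splitting the discounted cost at a time $t$ suggests why time enters as an extra variable:

```latex
% Illustrative sketch only; the symbols \lambda, \alpha, c, \xi_s, a_s are assumed notation.
\begin{align*}
J(\pi, x)
  &= \mathbb{E}^{\pi}_{x}\!\left[\exp\!\left(\lambda \int_0^{\infty} e^{-\alpha s}\, c(\xi_s, a_s)\, ds\right)\right], \\
\int_0^{\infty} e^{-\alpha s}\, c(\xi_s, a_s)\, ds
  &= \int_0^{t} e^{-\alpha s}\, c(\xi_s, a_s)\, ds
   + e^{-\alpha t} \int_0^{\infty} e^{-\alpha s}\, c(\xi_{t+s}, a_{t+s})\, ds, \\
\exp\!\left(\lambda \int_0^{\infty} e^{-\alpha s}\, c(\xi_s, a_s)\, ds\right)
  &= \exp\!\left(\lambda \int_0^{t} e^{-\alpha s}\, c(\xi_s, a_s)\, ds\right)
     \exp\!\left(\lambda e^{-\alpha t} \int_0^{\infty} e^{-\alpha s}\, c(\xi_{t+s}, a_{t+s})\, ds\right).
\end{align*}
```

In this sketch the continuation problem from time $t$ carries the effective risk parameter $\lambda e^{-\alpha t}$, which changes with $t$. Tracking $t$ directly, or equivalently tracking the risk parameter, is therefore natural, and it is consistent with optimal policies that depend on time.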
Acknowledgements
This research was supported in part by the National Natural Science Foundation of China (Grant No. 11931018), the University of Macau (Grant No. MYRG2019-00031-FBA), and Guangdong Basic and Applied Basic Research Foundation (Grant No. 2021A1515010057). We are also grateful to the anonymous referees for their careful reading and many constructive suggestions that have improved this paper.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare that are relevant to the content of this article.
Cite this article
Huang, Y., Lian, Z. & Guo, X. Risk-sensitive infinite-horizon discounted piecewise deterministic Markov decision processes. Oper Res Int J 22, 5791–5816 (2022). https://doi.org/10.1007/s12351-022-00726-w
DOI: https://doi.org/10.1007/s12351-022-00726-w
Keywords
- Piecewise deterministic Markov decision processes
- Risk sensitive
- Discounted cost
- HJB equation
- Non-stationarity