Abstract
Markov Decision Processes (MDPs) model problems in which a decision-maker makes sequential decisions whose effects are probabilistic. A particular formulation of MDPs is the Stochastic Shortest Path (SSP), in which the agent seeks to reach a goal while minimizing the cost of the path to it. The literature introduces several optimality criteria; most of them prioritize maximizing the probability of reaching the goal and, subject to that, minimize some cost measure. Such criteria give the decision-maker only a single, fixed trade-off between probability-to-goal and path cost. Here, we present algorithms that trade off probability-to-goal against expected cost. Based on the Minimum Cost given Maximum Probability (MCMP) criterion, we propose to treat this trade-off with three different methods: (i) additional constraints on probability-to-goal or expected cost; (ii) Pareto optimality, by finding non-dominated policies; and (iii) an efficient preference elicitation process based on non-dominated policies. We report experiments on a toy problem in which the trade-off between probability-to-goal and expected cost can be observed.
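To make the three notions in the abstract concrete, the following is a minimal sketch, not the authors' algorithms. It assumes each candidate policy has already been evaluated to a (probability-to-goal, expected cost) pair; the `PolicyValue` type, the function names, and the numbers are all hypothetical and chosen only for illustration. It shows an MCMP-style selection, a selection under a probability constraint as in method (i), and a filter for non-dominated policies as in method (ii).

```python
from typing import List, NamedTuple, Optional


class PolicyValue(NamedTuple):
    """Hypothetical summary of one evaluated policy."""
    name: str
    prob_to_goal: float   # probability of reaching the goal
    expected_cost: float  # expected cost of the path


def dominates(a: PolicyValue, b: PolicyValue) -> bool:
    """a dominates b: no worse on both measures, strictly better on one."""
    no_worse = (a.prob_to_goal >= b.prob_to_goal
                and a.expected_cost <= b.expected_cost)
    strictly_better = (a.prob_to_goal > b.prob_to_goal
                       or a.expected_cost < b.expected_cost)
    return no_worse and strictly_better


def non_dominated(policies: List[PolicyValue]) -> List[PolicyValue]:
    """Keep only policies that no other policy dominates."""
    return [p for p in policies
            if not any(dominates(q, p) for q in policies)]


def mcmp(policies: List[PolicyValue]) -> PolicyValue:
    """MCMP-style choice: maximize probability first, then minimize cost."""
    best_prob = max(p.prob_to_goal for p in policies)
    tied = [p for p in policies if p.prob_to_goal == best_prob]
    return min(tied, key=lambda p: p.expected_cost)


def cheapest_with_min_prob(policies: List[PolicyValue],
                           min_prob: float) -> Optional[PolicyValue]:
    """Constraint-style choice: cheapest policy with prob_to_goal >= min_prob."""
    feasible = [p for p in policies if p.prob_to_goal >= min_prob]
    return min(feasible, key=lambda p: p.expected_cost) if feasible else None


# Made-up values for illustration only.
candidates = [
    PolicyValue("safe", 0.95, 40.0),
    PolicyValue("balanced", 0.80, 15.0),
    PolicyValue("risky", 0.50, 5.0),
    PolicyValue("wasteful", 0.80, 30.0),
]

front = non_dominated(candidates)
print([p.name for p in front])
print(mcmp(candidates).name)
print(cheapest_with_min_prob(candidates, 0.75).name)
```

In this toy data the MCMP-style choice always returns the highest-probability policy regardless of its cost, whereas the constrained choice and the non-dominated set expose cheaper alternatives that give up some probability-to-goal.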
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kuo, I., Freire, V. (2021). Probability-to-Goal and Expected Cost Trade-Off in Stochastic Shortest Path. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol 12951. Springer, Cham. https://doi.org/10.1007/978-3-030-86970-0_9
DOI: https://doi.org/10.1007/978-3-030-86970-0_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86969-4
Online ISBN: 978-3-030-86970-0
eBook Packages: Computer Science, Computer Science (R0)