Probability-to-Goal and Expected Cost Trade-Off in Stochastic Shortest Path

Kuo, Isabella; Freire, Valdinei

doi:10.1007/978-3-030-86970-0_9

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12951)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

Markov Decision Processes (MDPs) model problems in which a decision-maker makes sequential decisions whose effects are probabilistic. A particular formulation of MDPs is the Stochastic Shortest Path (SSP), in which the agent seeks to accomplish a goal while reducing the cost of the path to it. The literature introduces several optimality criteria; most of them give priority to maximizing the probability of accomplishing the goal while minimizing some cost measure. Such criteria offer the decision-maker only a single trade-off between probability-to-goal and path cost. Here, we present algorithms to make a trade-off between probability-to-goal and expected cost. Based on the Minimum Cost given Maximum Probability (MCMP) criterion, we propose to treat this trade-off with three different methods: (i) additional constraints on probability-to-goal or expected cost; (ii) Pareto optimality, by finding non-dominated policies; and (iii) an efficient preference elicitation process based on non-dominated policies. We report experiments on a toy problem in which the trade-off between probability-to-goal and expected cost can be observed.
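To make the notions of constraint and non-dominance in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes each candidate policy has already been evaluated to a pair (probability-to-goal, expected cost); how expected cost is defined for policies that may fail to reach the goal depends on the chosen criterion and is not modeled here. The names `PolicyValue`, `non_dominated`, and `best_under_cost_bound`, as well as the sample numbers, are hypothetical.

```python
from typing import List, NamedTuple, Optional


class PolicyValue(NamedTuple):
    """Evaluation of one policy (hypothetical container, not from the paper)."""
    name: str
    prob_to_goal: float   # higher is better
    expected_cost: float  # lower is better


def dominates(a: PolicyValue, b: PolicyValue) -> bool:
    """True if a is at least as good as b on both measures and strictly better on one."""
    no_worse = a.prob_to_goal >= b.prob_to_goal and a.expected_cost <= b.expected_cost
    strictly_better = a.prob_to_goal > b.prob_to_goal or a.expected_cost < b.expected_cost
    return no_worse and strictly_better


def non_dominated(policies: List[PolicyValue]) -> List[PolicyValue]:
    """Keep the policies that no other policy dominates (the Pareto front)."""
    return [p for p in policies if not any(dominates(q, p) for q in policies)]


def best_under_cost_bound(policies: List[PolicyValue],
                          max_cost: float) -> Optional[PolicyValue]:
    """Maximize probability-to-goal subject to an expected-cost bound.

    Ties in probability are broken by lower expected cost.
    Returns None when no policy satisfies the bound.
    """
    feasible = [p for p in policies if p.expected_cost <= max_cost]
    if not feasible:
        return None
    return max(feasible, key=lambda p: (p.prob_to_goal, -p.expected_cost))


# Illustrative values only; they are not results from the paper.
candidates = [
    PolicyValue("safe", 0.95, 40.0),
    PolicyValue("balanced", 0.80, 22.0),
    PolicyValue("fast", 0.60, 10.0),
    PolicyValue("wasteful", 0.75, 30.0),  # dominated by "balanced"
]

front = non_dominated(candidates)
choice = best_under_cost_bound(candidates, max_cost=25.0)
```

In this sketch the front keeps `safe`, `balanced`, and `fast`, and the cost bound of 25.0 selects `balanced`. Enumerating policies explicitly is only feasible for toy problems; it serves here to illustrate the definitions rather than a practical solution method.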

Author information

Authors and Affiliations

Escola de Artes, Ciências e Humanidades - Universidade de São Paulo, São Paulo, Brazil
Isabella Kuo & Valdinei Freire

Corresponding author

Correspondence to Valdinei Freire.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Italy
Beniamino Murgante
Covenant University, Ota, Nigeria
Sanjay Misra
University of Cagliari, Cagliari, Italy
Chiara Garau
University of Cagliari, Cagliari, Italy
Ivan Blečić
Monash University, Clayton, VIC, Australia
David Taniar
Kyushu Sangyo University, Fukuoka, Japan
Bernady O. Apduhan
University of Minho, Braga, Portugal
Ana Maria A. C. Rocha
Polytechnic University of Bari, Bari, Italy
Eufemia Tarantino
Polytechnic University of Bari, Bari, Italy
Carmelo Maria Torre

Cite this paper

Kuo, I., Freire, V. (2021). Probability-to-Goal and Expected Cost Trade-Off in Stochastic Shortest Path. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2021. ICCSA 2021. Lecture Notes in Computer Science, vol 12951. Springer, Cham. https://doi.org/10.1007/978-3-030-86970-0_9

DOI: https://doi.org/10.1007/978-3-030-86970-0_9
Published: 11 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86969-4
Online ISBN: 978-3-030-86970-0
eBook Packages: Computer Science, Computer Science (R0)
