Abstract
In Stochastic Shortest Path (SSP) problems, the requirement that at least one policy reach the goal with probability 1 (probability-to-goal) cannot always be met. This is the case when dead ends, states from which the probability-to-goal is 0, are unavoidable under every policy, which demands alternative methods to handle such cases. A criterion maintains the \(\alpha \)-strong probability-to-goal priority property if a necessary condition for optimality is that the ratio between the probability-to-goal values of the optimal policy and any other policy is bounded by a value \(\alpha \), with \(0 \le \alpha \le 1\). This definition is helpful when evaluating the preferences induced by different criteria for SSPs with dead ends. The Min-Cost given Max-Prob (MCMP) criterion prefers, among the policies that maximize probability-to-goal, those that minimize a well-defined cost function in the presence of unavoidable dead ends. However, it guarantees \(\alpha \)-strong priority only for \(\alpha = 1\). In this paper, we define \(\alpha \)-MCMP, a criterion based on MCMP that additionally guarantees \(\alpha \)-strong priority for any value \(0 \le \alpha \le 1\). We also perform experiments comparing \(\alpha \)-MCMP and GUBS, the only other criterion known to have \(\alpha \)-strong priority for \(0 \le \alpha \le 1\), to analyze the difference between the probability-to-goal of policies generated by each criterion.
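On one reading of the definition above, \(\alpha \)-strong priority requires the optimal policy's probability-to-goal to be at least \(\alpha \) times that of any other policy. The toy sketch below (the states, transitions, and policies are invented for illustration; this is not the paper's algorithm) computes the probability-to-goal of two fixed policies in a tiny SSP with a dead end, by iterating the fixed-point equation \(P(s) = \sum_{s'} T(s, \pi(s), s')\,P(s')\):

```python
# Hypothetical two-policy SSP, made up for illustration:
#   s0 --safe -->  goal w.p. 0.5, stays in s0 w.p. 0.5   (prob-to-goal = 1.0)
#   s0 --risky-->  goal w.p. 0.8, dead end  w.p. 0.2     (prob-to-goal = 0.8)

def prob_to_goal(transition, start="s0", goal="goal", dead="dead", iters=1000):
    """Fixed-point iteration for the probability of reaching `goal`
    under a fixed policy.

    transition: dict mapping each state to a list of (next_state, prob)
    pairs induced by the policy.
    """
    p = {s: 0.0 for s in transition}
    p[goal] = 1.0  # goal is absorbing with probability-to-goal 1
    p[dead] = 0.0  # dead end is absorbing with probability-to-goal 0
    for _ in range(iters):
        for s, succ in transition.items():
            if s in (goal, dead):
                continue
            p[s] = sum(pr * p[s2] for s2, pr in succ)
    return p[start]

# Markov chains induced by the two policies.
safe = {"s0": [("goal", 0.5), ("s0", 0.5)], "goal": [], "dead": []}
risky = {"s0": [("goal", 0.8), ("dead", 0.2)], "goal": [], "dead": []}

p_safe = prob_to_goal(safe)    # -> approximately 1.0
p_risky = prob_to_goal(risky)  # -> 0.8
```

In this example the risky policy attains a probability-to-goal ratio of 0.8 relative to the maximizing policy, so a criterion that selects it could have \(\alpha \)-strong priority only for \(\alpha \le 0.8\), whereas MCMP (\(\alpha = 1\)) would be forced to pick the safe policy regardless of cost.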
Notes
- 1.
Note that not every line representing a value of \(\lambda \) is visible in the figures: some lines have values very close to those of other lines and may therefore be covered by them.
References
Bertsekas, D.: Dynamic Programming and Optimal Control. Athena Scientific, Belmont, Mass (1995)
Crispino, G.N., Freire, V., Delgado, K.V.: GUBS criterion: arbitrary trade-offs between cost and probability-to-goal in stochastic planning based on expected utility theory. Artif. Intell. 316, 103848 (2023)
d’Epenoux, F.: A probabilistic production and inventory problem. Manage. Sci. 10(1), 98–108 (1963)
Freire, V., Delgado, K.V.: GUBS: a utility-based semantic for goal-directed Markov decision processes. In: Proceedings of the 16th Conference on Autonomous Agents and Multiagent Systems, pp. 741–749 (2017)
Freire, V., Delgado, K.V., Reis, W.A.S.: An exact algorithm to make a trade-off between cost and probability in SSPs. In: Proceedings of the International Conference on Automated Planning and Scheduling, vol. 29, pp. 146–154 (2019)
Kolobov, A., Weld, D., et al.: A theory of goal-oriented MDPs with dead ends. In: Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, pp. 438–447 (2012)
Kolobov, A., Weld, D.S., Geffner, H.: Heuristic search for generalized stochastic shortest path MDPs. In: Proceedings of the Twenty-First International Conference on International Conference on Automated Planning and Scheduling, pp. 130–137 (2011)
Kuo, I., Freire, V.: Probability-to-goal and expected cost trade-off in stochastic shortest path. In: Gervasi, O., Murgante, B., Misra, S., Garau, C., Blečić, I., Taniar, D., Apduhan, B.O., Rocha, A.M.A.C., Tarantino, E., Torre, C.M. (eds.) ICCSA 2021. LNCS, vol. 12951, pp. 111–125. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86970-0_9
Little, I., Thiebaux, S., et al.: Probabilistic planning vs. replanning. In: ICAPS Workshop on IPC: Past, Present and Future, pp. 1–10 (2007)
Patek, S.D.: On terminating Markov decision processes with a risk-averse objective function. Automatica 37(9), 1379–1386 (2001)
Puterman, M.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York (1994)
Sanner, S., Yoon, S.: IPPC results presentation. In: International Conference on Automated Planning and Scheduling (2011). http://users.cecs.anu.edu.au/ssanner/IPPC_2011/IPPC_2011_Presentation.pdf
Silver, T., Chitnis, R.: PDDLGym: Gym environments from PDDL problems. In: International Conference on Automated Planning and Scheduling (ICAPS) PRL Workshop, pp. 1–6 (2020). https://github.com/tomsilver/pddlgym
Teichteil-Königsbuch, F.: Stochastic safest and shortest path problems. In: Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, pp. 1825–1831 (2012)
Teichteil-Königsbuch, F., Vidal, V., Infantes, G.: Extending classical planning heuristics to probabilistic planning with dead-ends. In: Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, pp. 1017–1022 (2011)
Trevizan, F.W., Teichteil-Königsbuch, F., Thiébaux, S.: Efficient solutions for stochastic shortest path problems with dead ends. In: Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence (UAI), pp. 1–10 (2017)
Acknowledgments
This study was supported in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) - Finance Code 001, by the São Paulo Research Foundation (FAPESP) grant \(\#\)2018/11236-9 and the Center for Artificial Intelligence (C4AI-USP), with support by FAPESP (grant #2019/07665-4) and by the IBM Corporation.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Crispino, G.N., Freire, V., Delgado, K.V. (2023). \(\alpha \)-MCMP: Trade-Offs Between Probability and Cost in SSPs with the MCMP Criterion. In: Naldi, M.C., Bianchi, R.A.C. (eds) Intelligent Systems. BRACIS 2023. Lecture Notes in Computer Science(), vol 14195. Springer, Cham. https://doi.org/10.1007/978-3-031-45368-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-45367-0
Online ISBN: 978-3-031-45368-7
eBook Packages: Computer Science; Computer Science (R0)