Anytime Algorithms for Solving Possibilistic MDPs and Hybrid MDPs

Bauters, Kim; Liu, Weiru; Godo, Lluís

doi:10.1007/978-3-319-30024-5_2

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9616)

Included in the following conference series:

FoIKS

Abstract

The ability of an agent to make quick, rational decisions in an uncertain environment is paramount for its applicability in realistic settings. Markov Decision Processes (MDPs) provide such a framework, but can only model uncertainty that can be expressed as probabilities. Possibilistic counterparts of MDPs make it possible to model imprecise beliefs, yet they cannot accurately represent probabilistic sources of uncertainty and they lack the efficient online solvers found in the probabilistic MDP community. In this paper we advance the state of the art in three important ways. Firstly, we propose the first online planner for possibilistic MDPs by adapting the Monte-Carlo Tree Search (MCTS) algorithm. A key component is the development of efficient search structures to sample possibility distributions based on the DPY transformation as introduced by Dubois, Prade, and Yager. Secondly, we introduce a hybrid MDP model that allows us to express both possibilistic and probabilistic uncertainty, where the hybrid model is a proper extension of both probabilistic and possibilistic MDPs. Thirdly, we demonstrate that MCTS algorithms can readily be applied to solve such hybrid models.
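As an illustration of the sampling step mentioned in the abstract, the following is a minimal sketch of drawing a state from a possibility distribution in a way that agrees with the DPY transformation. It assumes the standard formulation in which a threshold α is drawn uniformly from (0, 1] and a state is then chosen uniformly from the α-cut; the function names `dpy_probabilities` and `sample_dpy` are hypothetical, and this naive linear-time version is not the efficient search structure developed in the paper.

```python
import random
from typing import Dict, Hashable


def dpy_probabilities(pi: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Probabilities induced by the DPY transformation of a possibility distribution.

    Assumes pi is normalised, i.e. at least one state has possibility 1.
    With states ranked so that pi_1 >= pi_2 >= ... >= pi_n and pi_{n+1} = 0,
    p_i = sum_{j >= i} (pi_j - pi_{j+1}) / j.
    """
    ranked = sorted(pi.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or abs(ranked[0][1] - 1.0) > 1e-12:
        raise ValueError("possibility distribution must be normalised (max = 1)")
    degrees = [degree for _, degree in ranked] + [0.0]
    probs: Dict[Hashable, float] = {}
    tail = 0.0
    # Walk from the least to the most possible state, accumulating the tail sum.
    for i in range(len(ranked), 0, -1):
        tail += (degrees[i - 1] - degrees[i]) / i
        probs[ranked[i - 1][0]] = tail
    return probs


def sample_dpy(pi: Dict[Hashable, float], rng=random) -> Hashable:
    """Draw one state: pick alpha uniformly in (0, 1], then a uniform member of the alpha-cut."""
    alpha = 1.0 - rng.random()  # rng.random() lies in [0, 1), so alpha lies in (0, 1]
    cut = [state for state, degree in pi.items() if degree >= alpha]
    return rng.choice(cut)  # non-empty because some state has possibility 1


if __name__ == "__main__":
    pi = {"a": 1.0, "b": 0.5, "c": 0.25}  # illustrative values, not taken from the paper
    print(dpy_probabilities(pi))
    print(sample_dpy(pi, random.Random(0)))
```

Repeated calls to `sample_dpy` produce states with the frequencies given by `dpy_probabilities`, which is what lets a sampling-based planner such as MCTS treat a possibilistic transition like a stochastic one.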

Notes

1. A basic belief assignment, or bba, is a function of the form $m : 2^{\mathcal{S}} \rightarrow [0,1]$ satisfying $m(\emptyset) = 0$ and $\sum_{A \in 2^{\mathcal{S}}} m(A) = 1$.
2. To deal with uncertainty in MCTS, a dual-layered approach is used in the search tree. A decision node, or state, allows us to choose which action to perform. A chance node, or action, has a number of stochastic effects which are outside our control.
3. An implementation of Algorithm 3 is also available online at https://github.com/kimbauters/sparsepi.
4. A common approach in probability theory to try to overcome this problem is to use subjective probabilities. However, in the more general POMDP/MOMDP settings this creates difficulties of its own, as subjective probabilities from the transitions are then combined with objective probabilities from the observation function.
5. We use the terminology of neutral elements loosely here to indicate that a reward of 0 and a preference of 1 are the defaults. Indeed, when rewards (resp. preferences) are omitted, these are the values that MDPs (resp. $\pi$-MDPs) default to.
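The dual-layered search tree described in note 2 can be sketched as two alternating node types. This is only an illustrative data structure under stated assumptions: the class names, the visit and return counters, and the `sample_outcome(state, action)` callback are hypothetical and are not taken from the paper or from the sparsepi repository.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable


@dataclass
class ChanceNode:
    """An action taken in a state; its stochastic effects are outside the agent's control."""
    state: Hashable
    action: Hashable
    visits: int = 0
    total_return: float = 0.0
    children: Dict[Hashable, "DecisionNode"] = field(default_factory=dict)

    def sample_child(
        self, sample_outcome: Callable[[Hashable, Hashable], Hashable]
    ) -> "DecisionNode":
        """Sample a successor state and return (creating if needed) its decision node."""
        next_state = sample_outcome(self.state, self.action)  # hypothetical environment model
        if next_state not in self.children:
            self.children[next_state] = DecisionNode(next_state)
        return self.children[next_state]


@dataclass
class DecisionNode:
    """A state; the agent chooses which action to perform here."""
    state: Hashable
    visits: int = 0
    children: Dict[Hashable, ChanceNode] = field(default_factory=dict)

    def child_for(self, action: Hashable) -> ChanceNode:
        """Return (creating if needed) the chance node for performing `action` in this state."""
        if action not in self.children:
            self.children[action] = ChanceNode(self.state, action)
        return self.children[action]


if __name__ == "__main__":
    root = DecisionNode("s0")
    chance = root.child_for("move")
    successor = chance.sample_child(lambda state, action: "s1")  # deterministic toy model
    print(successor.state)
```

Keeping the two layers separate means the choice of action (under the agent's control) and the choice of outcome (sampled from the model) are handled by different code paths, which is where a probabilistic or possibilistic sampler would be plugged in.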

References

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2–3), 235–256 (2002)
Bellman, R.: A Markovian decision process. Indiana Univ. Math. J. 6, 679–684 (1957)
Drougard, N., Teichteil-Königsbuch, F., Farges, J., Dubois, D.: Qualitative possibilistic mixed-observable MDPs. In: Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI 2013) (2013)
Drougard, N., Teichteil-Königsbuch, F., Farges, J., Dubois, D.: Structured possibilistic planning using decision diagrams. In: Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI 2014), pp. 2257–2263 (2014)
Dubois, D., Prade, H.: On several representations of an uncertain body of evidence. In: Gupta, M.M., Sanchez, E. (eds.) Fuzzy Information and Decision Processes, pp. 167–181. North-Holland, Amsterdam (1982)
Dubois, D., Prade, H.: Unfair coins and necessity measures: towards a possibilistic interpretation of histograms. Fuzzy Sets Syst. 10(1), 15–20 (1983)
Dubois, D., Prade, H.: Possibility theory and its application: where do we stand? Mathware Soft Comput. 18(1), 18–31 (2011)
Dubois, D., Prade, H., Sandri, S.: On possibility/probability transformation. In: Proceedings of the 4th International Fuzzy Systems Association Congress (IFSA 1991), pp. 50–53 (1991)
Dubois, D., Prade, H., Smets, P.: New semantics for quantitative possibility theory. In: Benferhat, S., Besnard, P. (eds.) ECSQARU 2001. LNCS (LNAI), vol. 2143, pp. 410–421. Springer, Heidelberg (2001)
Kaufmann, A.: La simulation des sous-ensembles flous. In: Table Ronde CNRS-Quelques Applications Concrètes Utilisant les Derniers Perfectionnements de la Théorie du Flou (1980)
Kearns, M., Mansour, Y., Ng, A.: A sparse sampling algorithm for near-optimal planning in large Markov decision processes. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI 1999), pp. 1324–1331 (1999)
Keller, T., Eyerich, P.: PROST: probabilistic planning based on UCT. In: Proceedings of the 22nd International Conference on Automated Planning and Scheduling (ICAPS 2012) (2012)
Klir, G.: A principle of uncertainty and information invariance. Int. J. Gen. Syst. 17(2–3), 249–275 (1990)
Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006)
Kolobov, A., Mausam, Weld, D.: LRTDP versus UCT for online probabilistic planning. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI 2012) (2012)
Rao, A., Georgeff, M.: Modeling rational agents within a BDI-architecture. In: Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning (KR 1991), pp. 473–484 (1991)
Sabbadin, R.: A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments. In: Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence (UAI 1999), pp. 567–574 (1999)
Sabbadin, R., Fargier, H., Lang, J.: Towards qualitative approaches to multi-stage decision making. Int. J. Approximate Reasoning 19(3), 441–471 (1998)
Shafer, G., et al.: A Mathematical Theory of Evidence. Princeton University Press, Princeton (1976)
Smets, P.: Constructing the pignistic probability function in a context of uncertainty. In: Proceedings of the 5th Annual Conference on Uncertainty in Artificial Intelligence (UAI 1989), pp. 29–40 (1989)
Vose, M.: A linear algorithm for generating random numbers with a given distribution. IEEE Trans. Softw. Eng. 17(9), 972–975 (1991)
Yager, R.: Level sets for membership evaluation of fuzzy subset. In: Fuzzy Sets and Possibility Theory: Recent Developments, pp. 90–97. Pergamon Press, New York (1982)

Acknowledgements

This work is partially funded by the EPSRC PACES project (Ref: EP/J012149/1). Special thanks to Steven Schockaert, who read an early version of the paper and provided invaluable feedback. We would also like to thank the reviewers for taking the time to read the paper in detail and provide feedback that helped to further improve the quality of the paper.

Author information

Authors and Affiliations

Queen’s University Belfast, Belfast, UK
Kim Bauters & Weiru Liu
IIIA, CSIC, Bellaterra, Spain
Lluís Godo

Corresponding author

Correspondence to Kim Bauters.

Editor information

Editors and Affiliations

Faculteit Wetenschappen, Universiteit Hasselt, Hasselt, Belgium
Marc Gyssens
Dept. Ciencias Ingeniería Computación, Universidad Nacional del Sur, Bahía Blanca, Argentina
Guillermo Simari

About this paper

Cite this paper

Bauters, K., Liu, W., Godo, L. (2016). Anytime Algorithms for Solving Possibilistic MDPs and Hybrid MDPs. In: Gyssens, M., Simari, G. (eds) Foundations of Information and Knowledge Systems. FoIKS 2016. Lecture Notes in Computer Science, vol 9616. Springer, Cham. https://doi.org/10.1007/978-3-319-30024-5_2

DOI: https://doi.org/10.1007/978-3-319-30024-5_2
Published: 04 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30023-8
Online ISBN: 978-3-319-30024-5
eBook Packages: Computer Science, Computer Science (R0)
