Anytime Algorithms for Solving Possibilistic MDPs and Hybrid MDPs

  • Conference paper
Foundations of Information and Knowledge Systems (FoIKS 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9616)

Abstract

The ability of an agent to make quick, rational decisions in an uncertain environment is paramount for its applicability in realistic settings. Markov Decision Processes (MDPs) provide such a framework, but can only model uncertainty that can be expressed as probabilities. Possibilistic counterparts of MDPs allow imprecise beliefs to be modelled, yet they cannot accurately represent probabilistic sources of uncertainty and they lack the efficient online solvers found in the probabilistic MDP community. In this paper we advance the state of the art in three important ways. Firstly, we propose the first online planner for possibilistic MDPs by adapting the Monte-Carlo Tree Search (MCTS) algorithm. A key component is the development of efficient search structures to sample possibility distributions based on the DPY transformation as introduced by Dubois, Prade, and Yager. Secondly, we introduce a hybrid MDP model that allows us to express both possibilistic and probabilistic uncertainty, where the hybrid model is a proper extension of both probabilistic and possibilistic MDPs. Thirdly, we demonstrate that MCTS algorithms can readily be applied to solve such hybrid models.
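As a rough illustration of what sampling a possibility distribution can look like, the sketch below follows one common reading of the DPY transformation: the distinct possibility levels define nested alpha-cuts, each cut receives a mass equal to the gap to the next lower level, and a state is then drawn uniformly from a cut chosen according to those masses. The function names (`dpy_masses`, `induced_probability`, `sample_state`) and the dictionary encoding of the distribution are assumptions made for this example; the efficient search structures developed in the paper are not reproduced here.

```python
import random
from typing import Dict, FrozenSet, Hashable, List, Tuple

State = Hashable


def dpy_masses(pi: Dict[State, float]) -> List[Tuple[FrozenSet[State], float]]:
    """Map a normalised possibility distribution to nested focal sets with masses."""
    if not pi or abs(max(pi.values()) - 1.0) > 1e-9:
        raise ValueError("possibility distribution must be normalised (max = 1)")
    # Distinct positive possibility levels, highest first.
    levels = sorted({v for v in pi.values() if v > 0}, reverse=True)
    masses = []
    for i, level in enumerate(levels):
        next_level = levels[i + 1] if i + 1 < len(levels) else 0.0
        # Alpha-cut: every state at least as possible as the current level.
        cut = frozenset(s for s, v in pi.items() if v >= level)
        masses.append((cut, level - next_level))
    return masses


def induced_probability(pi: Dict[State, float]) -> Dict[State, float]:
    """Spread the mass of each focal set evenly over its members."""
    prob = {s: 0.0 for s in pi}
    for cut, mass in dpy_masses(pi):
        for s in cut:
            prob[s] += mass / len(cut)
    return prob


def sample_state(pi: Dict[State, float], rng: random.Random) -> State:
    """Pick a focal set by mass, then a state uniformly inside it."""
    masses = dpy_masses(pi)
    cuts = [cut for cut, _ in masses]
    weights = [mass for _, mass in masses]
    cut = rng.choices(cuts, weights=weights, k=1)[0]
    # Sort for a reproducible order before the uniform draw.
    return rng.choice(sorted(cut, key=repr))
```

Calling `sample_state({"a": 1.0, "b": 0.6, "c": 0.2}, random.Random(0))` repeatedly draws states with the frequencies given by `induced_probability`. This naive version rebuilds the cuts on every call, which is exactly the cost that precomputed search structures would avoid.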

Notes

  1. A basic belief assignment, or bba, is a function of the form \(m : 2^{\mathcal{S}} \rightarrow [0,1]\) satisfying \(m(\emptyset) = 0\) and \(\sum_{A \in 2^{\mathcal{S}}} m(A) = 1\).

  2. To deal with uncertainty in MCTS, a dual-layered approach is used in the search tree. A decision node, or state, allows us to choose which action to perform. A chance node, or action, has a number of stochastic effects which are outside our control.

  3. An implementation of the algorithm proposed in Algorithm 3 is also available online, at https://github.com/kimbauters/sparsepi.

  4. A common approach in probability theory to try to overcome this problem is to use subjective probabilities. However, in the more general POMDP/MOMDP settings this creates difficulties in its own right, as subjective probabilities from the transitions are then combined with objective probabilities from the observation function.

  5. We use the terminology of a neutral element loosely here to indicate that a reward of 0, and a preference of 1, are the defaults. Indeed, when rewards (resp. preferences) are omitted these are the values MDPs (resp. \(\pi\text{-MDP}\)s) default to.
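The dual-layered search tree described in note 2 can be pictured as two node types that alternate by depth. The following is a minimal sketch under assumed names: the classes `DecisionNode` and `ChanceNode`, their fields, and the `child_for` helper are hypothetical and are not taken from the paper or its implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable


@dataclass
class ChanceNode:
    """An action: its effects are stochastic and outside the agent's control."""
    action: Hashable
    visits: int = 0
    total_return: float = 0.0
    # Successor states observed so far, keyed by state.
    children: Dict[Hashable, "DecisionNode"] = field(default_factory=dict)

    def child_for(self, state: Hashable) -> "DecisionNode":
        """Return the decision node for an outcome, creating it on first visit."""
        if state not in self.children:
            self.children[state] = DecisionNode(state)
        return self.children[state]


@dataclass
class DecisionNode:
    """A state: the agent chooses which action to perform here."""
    state: Hashable
    visits: int = 0
    # Actions tried so far, keyed by action.
    children: Dict[Hashable, ChanceNode] = field(default_factory=dict)

    def child_for(self, action: Hashable) -> ChanceNode:
        """Return the chance node for an action, creating it on first visit."""
        if action not in self.children:
            self.children[action] = ChanceNode(action)
        return self.children[action]
```

A search step would descend from a `DecisionNode` by choosing an action, then from the resulting `ChanceNode` by sampling an outcome, so the two layers strictly alternate along every path.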

References

  1. Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2–3), 235–256 (2002)

  2. Bellman, R.: A Markovian decision process. Indiana Univ. Math. J. 6, 679–684 (1957)

  3. Drougard, N., Teichteil-Königsbuch, F., Farges, J., Dubois, D.: Qualitative possibilistic mixed-observable MDPs. In: Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI 2013) (2013)

  4. Drougard, N., Teichteil-Königsbuch, F., Farges, J., Dubois, D.: Structured possibilistic planning using decision diagrams. In: Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI 2014), pp. 2257–2263 (2014)

  5. Dubois, D., Prade, H.: On several representations of an uncertain body of evidence. In: Gupta, M.M., Sanchez, E. (eds.) Fuzzy Information and Decision Processes, pp. 167–181. North-Holland, Amsterdam (1982)

  6. Dubois, D., Prade, H.: Unfair coins and necessity measures: towards a possibilistic interpretation of histograms. Fuzzy Sets Syst. 10(1), 15–20 (1983)

  7. Dubois, D., Prade, H.: Possibility theory and its application: where do we stand? Mathware Soft Comput. 18(1), 18–31 (2011)

  8. Dubois, D., Prade, H., Sandri, S.: On possibility/probability transformation. In: Proceedings of the 4th International Fuzzy Systems Association Congress (IFSA 1991), pp. 50–53 (1991)

  9. Dubois, D., Prade, H., Smets, P.: New semantics for quantitative possibility theory. In: Benferhat, S., Besnard, P. (eds.) ECSQARU 2001. LNCS (LNAI), vol. 2143, pp. 410–421. Springer, Heidelberg (2001)

  10. Kaufmann, A.: La simulation des sous-ensembles flous. In: Table Ronde CNRS-Quelques Applications Concrètes Utilisant les Derniers Perfectionnements de la Théorie du Flou (1980)

  11. Kearns, M., Mansour, Y., Ng, A.: A sparse sampling algorithm for near-optimal planning in large Markov decision processes. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI 1999), pp. 1324–1331 (1999)

  12. Keller, T., Eyerich, P.: PROST: probabilistic planning based on UCT. In: Proceedings of the 22nd International Conference on Automated Planning and Scheduling (ICAPS 2012) (2012)

  13. Klir, G.: A principle of uncertainty and information invariance. Int. J. Gen. Syst. 17(2–3), 249–275 (1990)

  14. Kocsis, L., Szepesvári, C.: Bandit based Monte-Carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006)

  15. Kolobov, A., Mausam, Weld, D.: LRTDP versus UCT for online probabilistic planning. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI 2012) (2012)

  16. Rao, A., Georgeff, M.: Modeling rational agents within a BDI-architecture. In: Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning (KR 1991), pp. 473–484 (1991)

  17. Sabbadin, R.: A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments. In: Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence (UAI 1999), pp. 567–574 (1999)

  18. Sabbadin, R., Fargier, H., Lang, J.: Towards qualitative approaches to multi-stage decision making. Int. J. Approximate Reasoning 19(3), 441–471 (1998)

  19. Shafer, G.: A Mathematical Theory of Evidence. Princeton University Press, Princeton (1976)

  20. Smets, P.: Constructing the pignistic probability function in a context of uncertainty. In: Proceedings of the 5th Annual Conference on Uncertainty in Artificial Intelligence (UAI 1989), pp. 29–40 (1989)

  21. Vose, M.: A linear algorithm for generating random numbers with a given distribution. IEEE Trans. Softw. Eng. 17(9), 972–975 (1991)

  22. Yager, R.: Level sets for membership evaluation of fuzzy subset. In: Fuzzy Sets and Possibility Theory - Recent Developments, pp. 90–97. Pergamon Press, New York (1982)

Acknowledgements

This work is partially funded by the EPSRC PACES project (Ref: EP/J012149/1). Special thanks to Steven Schockaert, who read an early version of the paper and provided invaluable feedback. We would also like to thank the reviewers for taking the time to read the paper in detail and provide feedback that helped to further improve the quality of the paper.

Author information

Corresponding author

Correspondence to Kim Bauters.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Bauters, K., Liu, W., Godo, L. (2016). Anytime Algorithms for Solving Possibilistic MDPs and Hybrid MDPs. In: Gyssens, M., Simari, G. (eds) Foundations of Information and Knowledge Systems. FoIKS 2016. Lecture Notes in Computer Science, vol 9616. Springer, Cham. https://doi.org/10.1007/978-3-319-30024-5_2

  • DOI: https://doi.org/10.1007/978-3-319-30024-5_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30023-8

  • Online ISBN: 978-3-319-30024-5

  • eBook Packages: Computer Science (R0)