
Partially observable Markov decision processes for artificial intelligence

  • Invited Lectures
  • Conference paper
Reasoning with Uncertainty in Robotics (RUR 1995)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1093)


Abstract

In this paper, we bring techniques from operations research to bear on the problem of choosing optimal actions in partially observable stochastic domains. In many cases, we have developed new ways of viewing the problem that are, perhaps, more consistent with the AI perspective. We begin by introducing the theory of Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). We then outline a novel algorithm for solving POMDPs off-line and show how, in many cases, a finite-memory controller can be extracted from the solution to a POMDP. We conclude with a simple example.
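As background to the abstract, the fully observable case it builds on can be sketched with value iteration, the classical dynamic-programming method for MDPs. The sketch below is illustrative only: the two-state domain, the action names, and all transition probabilities and rewards are invented for this example and do not come from the paper.

```python
# Value iteration for a tiny, fully observable MDP (illustrative sketch).
# P[s][a] is a list of (next_state, probability) pairs; R[s][a] is the
# immediate reward for taking action a in state s.

def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-6):
    """Return the optimal value function V and a greedy policy."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman backup: best one-step lookahead value over actions.
            q = [R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                 for a in actions]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Extract the policy that is greedy with respect to V.
    policy = {
        s: max(actions,
               key=lambda a: R[s][a]
               + gamma * sum(p * V[s2] for s2, p in P[s][a]))
        for s in states
    }
    return V, policy

# A made-up two-state example: "go" switches states, "stay" remains;
# only staying in state B yields reward.
states = ["A", "B"]
actions = ["stay", "go"]
P = {"A": {"stay": [("A", 1.0)], "go": [("B", 1.0)]},
     "B": {"stay": [("B", 1.0)], "go": [("A", 1.0)]}}
R = {"A": {"stay": 0.0, "go": 0.0},
     "B": {"stay": 1.0, "go": 0.0}}

V, policy = value_iteration(states, actions, P, R)
# The greedy policy goes to B and stays there; V["B"] approaches
# 1 / (1 - gamma) = 10 and V["A"] approaches gamma * V["B"] = 9.
```

The POMDP setting the paper addresses generalizes this: the agent cannot observe the state directly, so the backup must instead be performed over probability distributions (belief states), which is what makes exact solution far harder than the loop above.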




Editor information

Leo Dorst, Michiel van Lambalgen, Frans Voorbraak


Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaelbling, L.P., Littman, M.L., Cassandra, A.R. (1996). Partially observable Markov decision processes for artificial intelligence. In: Dorst, L., van Lambalgen, M., Voorbraak, F. (eds) Reasoning with Uncertainty in Robotics. RUR 1995. Lecture Notes in Computer Science, vol 1093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0013957


  • DOI: https://doi.org/10.1007/BFb0013957


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61376-3

  • Online ISBN: 978-3-540-68506-7

