Abstract
This paper aims to speed up the pruning procedure encountered in exact value iteration for POMDPs. The value function of a POMDP can be represented by a finite set of vectors over the state space. In each step of the exact value iteration algorithm, the number of possible vectors grows linearly with the cardinality of the action set and exponentially with the cardinality of the observation set. This set of vectors must be pruned to a minimal subset that retains the same value function over the state space, so pruning is in general the bottleneck in finding the optimal policy for a POMDP. This paper analyses two linear-programming-based methods for detecting the useless vectors: the classical Lark’s algorithm and the recently proposed Skyline algorithm. We claim that both algorithms can be drastically improved by using information about the support regions of the vectors that have already been processed. We present comparative experiments on both randomly generated problems and POMDP benchmarks.
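To make the growth rate and the pruning goal concrete, here is a minimal Python sketch. It is not Lark's algorithm or the Skyline algorithm, and it is not the method proposed in the paper. It assumes the usual count of |A| · |V|^|O| candidate vectors per exact backup, and it restricts pruning to a two-state POMDP, where each vector is a line over the belief p and the useful vectors can be found from pairwise intersections without a linear program. The function names `num_candidate_vectors` and `prune_two_state` are hypothetical.

```python
from itertools import combinations


def num_candidate_vectors(num_actions, num_observations, num_vectors):
    """Unpruned set size after one exact backup, assuming |A| * |V|**|O|."""
    return num_actions * num_vectors ** num_observations


def prune_two_state(vectors, eps=1e-9):
    """Minimal subset with the same upper envelope over beliefs (p, 1 - p).

    Illustration for two states only; with more states the same test is
    usually posed as a linear program per vector.
    """
    def value(v, p):
        # Value of vector v at belief (p, 1 - p).
        return p * v[0] + (1.0 - p) * v[1]

    # Drop exact duplicates while keeping the input order.
    vectors = list(dict.fromkeys(map(tuple, vectors)))

    # Breakpoints: both endpoints plus every pairwise crossing in [0, 1].
    points = {0.0, 1.0}
    for u, w in combinations(vectors, 2):
        denom = (u[0] - u[1]) - (w[0] - w[1])
        if abs(denom) > eps:  # parallel lines never cross
            p = (w[1] - u[1]) / denom
            if 0.0 <= p <= 1.0:
                points.add(p)
    points = sorted(points)

    # No two lines cross strictly between consecutive breakpoints, so the
    # maximiser at each interval midpoint has a support region of positive
    # length and must be kept.
    kept = []
    for lo, hi in zip(points, points[1:]):
        mid = 0.5 * (lo + hi)
        best = max(vectors, key=lambda v: value(v, mid))
        if best not in kept:
            kept.append(best)
    return kept


if __name__ == "__main__":
    print(num_candidate_vectors(2, 3, 4))
    print(prune_two_state([(1.0, 0.0), (0.0, 1.0), (0.4, 0.4), (0.6, 0.6)]))
```

In the sample call, the vector (0.4, 0.4) lies below the upper envelope at every belief and is discarded, while (0.6, 0.6) is kept because it is maximal on an interval around p = 0.5. The interval on which a kept vector is maximal is its support region in this one-dimensional setting.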
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Özgen, S., Demirekler, M. (2016). A Fast Elimination Method for Pruning in POMDPs. In: Friedrich, G., Helmert, M., Wotawa, F. (eds) KI 2016: Advances in Artificial Intelligence. KI 2016. Lecture Notes in Computer Science, vol 9904. Springer, Cham. https://doi.org/10.1007/978-3-319-46073-4_5
DOI: https://doi.org/10.1007/978-3-319-46073-4_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46072-7
Online ISBN: 978-3-319-46073-4
eBook Packages: Computer Science, Computer Science (R0)