Abstract
Markov decision process (MDP) provides the foundations for a number of problems, such as artificial intelligence studying, automated planning and reinforcement learning. MDP can be solved efficiently in theory. However, for large scenarios, more investigations are needed to reveal practical algorithms. Algorithms for solving MDP have a natural concurrency. In this paper, we present parallel algorithms based on dynamic programming. Meanwhile, the cost of computation and communication complexity of this method is analyzed. Moreover, experimental results demonstrate excellent speedups and scalability.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Otterlo, M.V.: A Survey of Reinforcement Learning in Relational Domains, Technical Report, TR-CTIT-05-31, ISBN ISSN 1381-3625, CTIT Technical Report Series, Pages: 70 (2005)
Foster, I., Kesselman, C.: The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, San Francisco (1999); Sutton, R.S., Barto, A.G.: Reinforcement Learning: an Introduction. The MIT Press, Cambridge (1998)
Bhulai, S.: Markov Decision Processes the control of high-dimensional system, Dissertation. University press, Amsterdam (2002)
Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement Learning: A Survey. Journal of Artificial Intelligence Research 4, 237–285 (1996)
Howard, R.A.: Dynamic Programming and Markov Processes. The MIT Press, Cambridge (1960)
Littman, M.L., Dean, T.L., Kaelbling, L.P.: On the complexity of solving Markov decision problems. In: Proceedings of the Eleventh Annual Conference on Uncertainty in Artificial Intelligence (UAI 1995) Montreal, Quebec, Canada (1995)
Puterman, M.L.: Markov Decision Processes. John Wiley & Sons, New York (1994)
Coppersmith, D., Winograd, S.: Matrix multiplication via arithmetic progressions. In: Proceedings of 19th Annual ACM Symposium on Theory of Computing, pp. 1–6 (1987)
Kumar, V., Grama, A., Gupta, A., Karypis, G.: Introduction to Parallel Computing: Algorithm Design and Analysis. Benjamin Commings/Addison Wesley, Redwod City (1994)
Guestrin, C.E., Koller, D., Gearhart, C., Kanodia, N.: Generalizing plans to new environments in relational MDPs. In: Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence (IJCAI 2003), Acapulco, Mexico (2003)
Gearhart, C.: Genetic Programming as Policy Search in Markov Decision Processes. In: Genetic Algorithms and Genetic Programming at Stanford 2003, Stanford California, USA, pp. 61–67 (2003)
Bellman, R.E.: Dynamic Programming. Princeton University Press, Princeton (1957)
Watkins, C.J.C.H., Dayan, P.: Q-learning, Machine Learning Journal. Special Issue on Reinforcement Learning 8(3/4) (1992)
Stratagus, http://www.stratagus.org/
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Q., Sun, G., Xu, Y. (2009). Parallel Algorithms for Solving Markov Decision Process. In: Hua, A., Chang, SL. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2009. Lecture Notes in Computer Science, vol 5574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03095-6_45
Download citation
DOI: https://doi.org/10.1007/978-3-642-03095-6_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03094-9
Online ISBN: 978-3-642-03095-6
eBook Packages: Computer ScienceComputer Science (R0)