Abstract
Reinforcement learning addresses the problem of learning optimal policies for sequential decision-making problems involving stochastic operators and numerical reward functions, rather than the more traditional deterministic operators and logical goal predicates. In many ways, reinforcement learning research is recapitulating the development of classical research in planning and problem solving. After studying methods for solving “flat” problem spaces, researchers have recently turned their attention to hierarchical methods that incorporate subroutines and state abstractions. This paper gives an overview of the MAXQ value function decomposition and its support for state abstraction and action abstraction.
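As a brief illustration of the decomposition named here (a sketch following the notation of Dietterich's full MAXQ paper), the value of invoking a child subtask a within a parent task i splits into the reward earned while a executes and the reward for completing i afterward:

\[
Q^{\pi}(i, s, a) \;=\; V^{\pi}(a, s) \;+\; C^{\pi}(i, s, a)
\]

Here V^π(a, s) is the expected cumulative reward of executing subtask a from state s, defined recursively over the task hierarchy and bottoming out at the expected one-step reward for primitive actions, and C^π(i, s, a) is the completion function: the expected cumulative reward of finishing task i after a terminates. Because the completion function for a task typically depends on only a few state variables, this decomposition is what makes the state abstractions mentioned above possible.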
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Dietterich, T. G. (2000). An Overview of MAXQ Hierarchical Reinforcement Learning. In Choueiry, B. Y., & Walsh, T. (Eds.), Abstraction, Reformulation, and Approximation (SARA 2000). Lecture Notes in Computer Science, vol. 1864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44914-0_2
Print ISBN: 978-3-540-67839-7
Online ISBN: 978-3-540-44914-0