Abstract
In recent years, hierarchical concepts of temporal abstraction have been integrated into the reinforcement learning framework to improve scalability. However, existing approaches are limited to domains where a decomposition into subtasks is known a priori. In this paper we propose explicitly selecting time-scale-related actions when no subgoal-related abstract actions are available. This is realised with multi-step actions on different time scales that are combined in a single action set. The special structure of this action set is exploited in the MSA-Q-learning algorithm. By learning on several explicitly specified time scales simultaneously, a considerable improvement in learning speed can be achieved, as demonstrated on two benchmark problems.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schoknecht, R., Riedmiller, M. (2002). Speeding-up Reinforcement Learning with Multi-step Actions. In: Dorronsoro, J.R. (eds) Artificial Neural Networks — ICANN 2002. ICANN 2002. Lecture Notes in Computer Science, vol 2415. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46084-5_132
DOI: https://doi.org/10.1007/3-540-46084-5_132
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44074-1
Online ISBN: 978-3-540-46084-8