Abstract
In recent years, hierarchical concepts of temporal abstraction have been integrated into the reinforcement learning framework to improve scalability. However, existing approaches are limited to domains where a decomposition into subtasks is known a priori. In this article we propose explicitly selecting time-scale-related abstract actions when no subgoal-related abstract actions are available. This concept is realised with multi-step actions on different time scales that are combined in a single action set. We exploit the special structure of this action set in the MSA-Q-learning algorithm. The approach is suited to learning optimal policies in “unstructured” domains, where a decomposition into subtasks is not known in advance or does not exist at all. By learning on several explicitly specified time scales simultaneously, we achieve a considerable improvement in learning speed, which we demonstrate on several benchmark problems.
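To make the idea concrete, the following is a minimal sketch of tabular Q-learning over a combined action set of multi-step actions, i.e. a primitive action repeated for n steps on one of several explicitly specified time scales. This is an illustration under assumptions, not the paper's MSA-Q-learning implementation: the toy chain environment, the constants, and all function names are invented for this example.

import numpy as np

GAMMA = 0.95          # discount factor (assumed)
ALPHA = 0.1           # learning rate (assumed)
N_STATES = 20
PRIMITIVE_ACTIONS = [0, 1]    # e.g. move left / move right
TIME_SCALES = [1, 2, 4]       # explicitly specified durations

# Combined action set: every (primitive action, duration) pair,
# so actions on all time scales compete within one Q-table.
MULTI_STEP_ACTIONS = [(a, n) for a in PRIMITIVE_ACTIONS for n in TIME_SCALES]
Q = np.zeros((N_STATES, len(MULTI_STEP_ACTIONS)))

class ToyChain:
    """Hypothetical deterministic chain; reward 1 on reaching the last state."""
    def step(self, state, action):
        nxt = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
        done = (nxt == N_STATES - 1)
        return nxt, (1.0 if done else 0.0), done

def execute(env, state, action, n):
    """Repeat a primitive action up to n times, accumulating discounted reward."""
    total, discount, steps, done = 0.0, 1.0, 0, False
    for _ in range(n):
        state, reward, done = env.step(state, action)
        total += discount * reward
        discount *= GAMMA
        steps += 1
        if done:
            break
    return state, total, steps, done

def update(state, idx, next_state, reward, steps, done):
    """Q-learning backup for a multi-step action, discounted by gamma**steps."""
    target = reward if done else reward + (GAMMA ** steps) * Q[next_state].max()
    Q[state, idx] += ALPHA * (target - Q[state, idx])

env = ToyChain()
rng = np.random.default_rng(0)
for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy selection over the combined action set
        idx = (int(rng.integers(len(MULTI_STEP_ACTIONS))) if rng.random() < 0.2
               else int(Q[state].argmax()))
        action, n = MULTI_STEP_ACTIONS[idx]
        next_state, reward, steps, done = execute(env, state, action, n)
        update(state, idx, next_state, reward, steps, done)
        state = next_state

One plausible way to exploit the special structure of this action set, in the spirit of what the abstract describes, is to note that the executed trajectory of an n-step action contains every shorter multi-step action with the same primitive action as a prefix, so a single execution can in principle back up Q-values on all time scales up to n; the sketch above omits this for brevity.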
Acknowledgments
This work is part of the project "Reinforcement-Lernen auf unterschiedlichen Zeitskalen" (reinforcement learning on different time scales), which is supported by the Deutsche Forschungsgemeinschaft (DFG).
Cite this article
Schoknecht, R., Riedmiller, M. Reinforcement learning on explicitly specified time scales. Neural Computing & Applications 12, 61–80 (2003). https://doi.org/10.1007/s00521-003-0368-x