Abstract
We present a method for automatically creating a set of useful temporally-extended actions, or skills, in reinforcement learning. Our method identifies states that allow the agent to transition to a different region of the state space—for example, a doorway between two rooms—and generates temporally-extended actions that efficiently take the agent to these states. In identifying such states we use the concept of relative novelty, a measure of how much short-term novelty a state introduces to the agent. The resulting algorithm is simple, has low computational complexity, and is shown to improve performance in a number of problems.
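The abstract gives no formula for relative novelty, so the following is only a minimal sketch under stated assumptions: the novelty of a state is taken to be 1/sqrt(n), where n is the number of visits so far, and the relative novelty at a point in a trajectory is taken to be the ratio of mean novelty over a short window starting at that point to mean novelty over the window just before it. The function names, the window length, and the toy trajectory are hypothetical and chosen for illustration.

```python
from collections import Counter
from math import sqrt


def novelty(visit_count):
    """Assumed novelty measure: decays as a state is visited more often."""
    return 1.0 / sqrt(visit_count)


def relative_novelty_scores(trajectory, window=2):
    """Score each trajectory position that has a full window on both sides.

    Score = mean novelty of the `window` states starting at position t
            divided by mean novelty of the `window` states before t.
    Novelty is measured at visit time, using running visit counts.
    """
    counts = Counter()
    nov = []
    for state in trajectory:
        counts[state] += 1
        nov.append(novelty(counts[state]))

    scores = []
    for t in range(window, len(trajectory) - window + 1):
        before = sum(nov[t - window:t]) / window
        after = sum(nov[t:t + window]) / window
        scores.append((trajectory[t], after / before))
    return scores


# Toy trajectory: the agent paces between two states of one room,
# then passes through a doorway into unvisited states of another room.
trajectory = ["a1", "a2", "a1", "a2", "a1", "a2", "door", "b1", "b2"]
scores = relative_novelty_scores(trajectory, window=2)
best_state, best_score = max(scores, key=lambda pair: pair[1])
print(best_state)  # the doorway state scores highest in this toy example
```

In this sketch a state scores high when familiar states precede it and unfamiliar states follow it, which matches the doorway intuition in the abstract. A full method would still need a rule for turning scores into subgoals and a way to learn the skill that reaches each subgoal; neither is shown here.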
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Şimşek, Ö., Barto, A.G. (2005). Learning Skills in Reinforcement Learning Using Relative Novelty. In: Zucker, JD., Saitta, L. (eds) Abstraction, Reformulation and Approximation. SARA 2005. Lecture Notes in Computer Science, vol 3607. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527862_36
DOI: https://doi.org/10.1007/11527862_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27872-6
Online ISBN: 978-3-540-31882-8
eBook Packages: Computer Science (R0)