Learning Skills in Reinforcement Learning Using Relative Novelty

Şimşek, Özgür; Barto, Andrew G.

doi:10.1007/11527862_36

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3607)

Included in the following conference series:

International Symposium on Abstraction, Reformulation, and Approximation

Abstract

We present a method for automatically creating a set of useful temporally-extended actions, or skills, in reinforcement learning. Our method identifies states that allow the agent to transition to a different region of the state space—for example, a doorway between two rooms—and generates temporally-extended actions that efficiently take the agent to these states. In identifying such states we use the concept of relative novelty, a measure of how much short-term novelty a state introduces to the agent. The resulting algorithm is simple, has low computational complexity, and is shown to improve performance in a number of problems.
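To make the idea of relative novelty concrete, here is a minimal sketch, not the authors' implementation. It assumes that a state's novelty is 1/sqrt(visit count) and that relative novelty is the ratio of mean novelty in a short window after a state (including the state itself) to mean novelty in the window before it. The function names, the window length, the example walk, and the use of whole-trajectory visit counts are all illustrative assumptions; the abstract does not specify them.

```python
from collections import Counter
from math import sqrt


def novelty(visit_count):
    # Assumption: novelty shrinks as a state is visited more often.
    return 1.0 / sqrt(visit_count)


def relative_novelty(trajectory, window=2):
    """Score each state by (mean novelty after) / (mean novelty before).

    Simplification: visit counts are taken over the whole trajectory
    rather than being updated online as the agent moves.
    """
    counts = Counter(trajectory)
    scores = []
    for t in range(window, len(trajectory) - window + 1):
        before = trajectory[t - window:t]
        after = trajectory[t:t + window]  # includes the state itself
        mean_before = sum(novelty(counts[s]) for s in before) / window
        mean_after = sum(novelty(counts[s]) for s in after) / window
        scores.append((trajectory[t], mean_after / mean_before))
    return scores


# Hypothetical walk: pacing in room A (a1, a2), then through doorway d into room B.
walk = ["a1", "a2", "a1", "a2", "a1", "a2", "d", "b1", "b2"]
scores = relative_novelty(walk, window=2)
best_state, best_score = max(scores, key=lambda pair: pair[1])
print(best_state, round(best_score, 3))
```

In this toy walk the doorway state `d` receives the highest score, because it is preceded by frequently visited states and followed by unvisited ones. A full method would still need a rule for deciding which scores are high enough to mark a state as a skill target, which this sketch leaves out.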

Author information

Authors and Affiliations

Department of Computer Science, University of Massachusetts, Amherst, MA, 01003-9264, USA
Özgür Şimşek & Andrew G. Barto

Editor information

Editors and Affiliations

UR 079 GEODES, IRD, 32 avenue Henri Varagnat, 93143, Bondy, France
Jean-Daniel Zucker
Dip. di Informatica, Università del Piemonte Orientale, Via Bellini 25/G, 15100, Alessandria, Italy
Lorenza Saitta

About this paper

Cite this paper

Şimşek, Ö., Barto, A.G. (2005). Learning Skills in Reinforcement Learning Using Relative Novelty. In: Zucker, JD., Saitta, L. (eds) Abstraction, Reformulation and Approximation. SARA 2005. Lecture Notes in Computer Science, vol 3607. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527862_36

DOI: https://doi.org/10.1007/11527862_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27872-6
Online ISBN: 978-3-540-31882-8
eBook Packages: Computer Science, Computer Science (R0)
