
Learning Skills in Reinforcement Learning Using Relative Novelty

  • Conference paper

In: Abstraction, Reformulation and Approximation (SARA 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3607)

Abstract

We present a method for automatically creating a set of useful temporally-extended actions, or skills, in reinforcement learning. Our method identifies states that allow the agent to transition to a different region of the state space—for example, a doorway between two rooms—and generates temporally-extended actions that efficiently take the agent to these states. In identifying such states we use the concept of relative novelty, a measure of how much short-term novelty a state introduces to the agent. The resulting algorithm is simple, has low computational complexity, and is shown to improve performance in a number of problems.
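To make the idea concrete, here is a minimal Python sketch of relative-novelty scoring over a single trajectory in a tabular domain. The 1/√n novelty measure is in the spirit of the paper, but the window size, threshold, and all identifiers (WINDOW, THRESHOLD, relative_novelty_scores, candidate_subgoals) are assumptions made for exposition, not the authors' published code; the full algorithm additionally filters candidates and constructs options that reach them, which is not shown here.

```python
from collections import defaultdict
import math

# Illustrative constants, not values from the paper.
WINDOW = 7        # assumed length of the short-term "before"/"after" horizon
THRESHOLD = 2.0   # assumed cutoff for flagging a candidate subgoal state

visit_counts = defaultdict(int)  # n_s: how often each state has been visited

def novelty(state):
    # Novelty shrinks as a state becomes familiar: 1 / sqrt(n_s).
    # Assumes visit_counts[state] > 0, which relative_novelty_scores ensures.
    return 1.0 / math.sqrt(visit_counts[state])

def relative_novelty_scores(trajectory):
    # Score each interior state by how novel the states that follow it are
    # relative to the states that precede it. A doorway scores high because
    # it leads from a well-explored room into a rarely visited one.
    for s in trajectory:          # update counts first so novelty is defined
        visit_counts[s] += 1
    scores = {}
    for t in range(WINDOW, len(trajectory) - WINDOW):
        before = trajectory[t - WINDOW:t]
        after = trajectory[t + 1:t + 1 + WINDOW]
        scores[trajectory[t]] = (sum(novelty(s) for s in after) /
                                 sum(novelty(s) for s in before))
    return scores

def candidate_subgoals(trajectory):
    # States whose relative novelty exceeds the threshold become targets
    # of new temporally-extended actions (options).
    scores = relative_novelty_scores(trajectory)
    return [s for s, r in scores.items() if r > THRESHOLD]
```

The per-state computation is constant-time per step of experience, which is what gives the algorithm its low computational complexity relative to graph-partitioning approaches to subgoal discovery.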

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Şimşek, Ö., Barto, A.G. (2005). Learning Skills in Reinforcement Learning Using Relative Novelty. In: Zucker, J.-D., Saitta, L. (eds) Abstraction, Reformulation and Approximation. SARA 2005. Lecture Notes in Computer Science (LNAI), vol 3607. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527862_36

  • DOI: https://doi.org/10.1007/11527862_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27872-6

  • Online ISBN: 978-3-540-31882-8

  • eBook Packages: Computer Science, Computer Science (R0)
