Hierarchical Reinforcement Learning

Reference work entry in: Encyclopedia of Machine Learning and Data Mining

Definition

Hierarchical reinforcement learning (HRL) decomposes a reinforcement learning problem into a hierarchy of subproblems or subtasks such that higher-level parent tasks invoke lower-level child tasks as if they were primitive actions. A decomposition may have multiple levels of hierarchy. Some or all of the subproblems can themselves be reinforcement learning problems. When a parent task is formulated as a reinforcement learning problem, it is commonly formalized as a semi-Markov decision problem because its actions are child tasks that persist for an extended period of time. The advantage of hierarchical decomposition is a reduction in computational complexity if the overall problem can be represented more compactly and reusable subtasks can be learned or provided independently. While the solution to an HRL problem is optimal given the constraints of the hierarchy, there are no guarantees in general that the decomposed solution is an optimal solution to the original reinforcement learning problem.
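
The semi-Markov formulation can be made concrete with a small sketch. The Python fragment below runs SMDP Q-learning over two hand-coded, temporally extended child tasks ("run left" and "run right") in a toy corridor, in the spirit of the options framework of Sutton, Precup, and Singh (1999) cited below. The environment, the option definitions, and every name in it (run_option, OPTIONS, and so on) are illustrative assumptions rather than anything specified in this entry. The point to notice is the backup: because a child task persists for k primitive steps, the successor state's value is discounted by gamma**k rather than gamma.

    import random
    from collections import defaultdict

    GAMMA, ALPHA, EPSILON = 0.95, 0.1, 0.1
    N_STATES, GOAL = 10, 9   # corridor states 0..9; reward on reaching state 9

    def step(state, action):
        # Primitive step: action -1 moves left, +1 moves right.
        nxt = max(0, min(N_STATES - 1, state + action))
        return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

    def run_option(state, direction):
        # A temporally extended child task: keep moving one way until an end
        # of the corridor (or the goal) is reached. Returns the resulting
        # state, the reward discounted while inside the option, the number
        # of primitive steps taken, and a terminal flag.
        total, discount, steps = 0.0, 1.0, 0
        while True:
            state, r, done = step(state, direction)
            total += discount * r
            discount *= GAMMA
            steps += 1
            if done or state in (0, N_STATES - 1):
                return state, total, steps, done

    OPTIONS = (-1, +1)      # the parent task's "actions" are these child tasks
    Q = defaultdict(float)  # Q-values over (state, option) pairs

    for episode in range(500):
        s, done = random.randint(0, N_STATES - 2), False
        while not done:
            # epsilon-greedy choice among the child tasks
            o = (random.choice(OPTIONS) if random.random() < EPSILON
                 else max(OPTIONS, key=lambda opt: Q[s, opt]))
            s2, r, k, done = run_option(s, o)
            # SMDP backup: the child task lasted k primitive steps, so the
            # successor value is discounted by GAMMA ** k, not by GAMMA.
            successor = 0.0 if done else max(Q[s2, opt] for opt in OPTIONS)
            Q[s, o] += ALPHA * (r + GAMMA ** k * successor - Q[s, o])
            s = s2

    # Greedy child task per state after learning (expected: +1, run right,
    # from every state on the way to the goal)
    print({s: max(OPTIONS, key=lambda opt: Q[s, opt]) for s in range(N_STATES)})

In this toy case the decomposed solution happens to coincide with the optimal flat policy; in general, as noted above, only optimality with respect to the given hierarchy is guaranteed.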

Recommended Reading

  • Ashby R (1956) Introduction to cybernetics. Chapman & Hall, London

  • Barto A, Mahadevan S (2003) Recent advances in hierarchical reinforcement learning. Discrete Event Dynamic Systems (special issue on reinforcement learning) 13:41–77

  • Dayan P, Hinton GE (1992) Feudal reinforcement learning. In: Advances in neural information processing systems 5 (NIPS conference), Denver, 2–5 Dec 1991. Morgan Kaufmann, San Francisco

  • Dietterich TG (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. J Artif Intell Res 13:227–303

  • Digney BL (1998) Learning hierarchical control structures for multiple tasks and changing environments. In: From animals to animats 5: proceedings of the fifth international conference on simulation of adaptive behaviour (SAB 98), Zurich, 17–21 Aug 1998. MIT Press, Cambridge

  • Ghavamzadeh M, Mahadevan S (2002) Hierarchically optimal average reward reinforcement learning. In: Sammut C, Hoffmann A (eds) Proceedings of the nineteenth international conference on machine learning, Sydney. Morgan Kaufmann, San Francisco, pp 195–202

  • Hauskrecht M, Meuleau N, Kaelbling LP, Dean T, Boutilier C (1998) Hierarchical solution of Markov decision processes using macro-actions. In: Fourteenth annual conference on uncertainty in artificial intelligence, Madison, pp 220–229

  • Hengst B (2008) Partial order hierarchical reinforcement learning. In: Australasian conference on artificial intelligence, Auckland, Dec 2008. Springer, Berlin, pp 138–149

  • Jonsson A, Barto A (2006) Causal graph based decomposition of factored MDPs. J Mach Learn Res 7:2259–2301

  • Kaelbling LP (1993) Hierarchical learning in stochastic domains: preliminary results. In: Proceedings of the tenth international conference on machine learning. Morgan Kaufmann, San Mateo, pp 167–173

  • Konidaris G, Barto A (2009) Skill discovery in continuous reinforcement learning domains using skill chaining. In: Bengio Y, Schuurmans D, Lafferty J, Williams CKI, Culotta A (eds) Advances in neural information processing systems 22, Vancouver, pp 1015–1023

  • McGovern A (2002) Autonomous discovery of abstractions through interaction with an environment. In: SARA. Springer, London, pp 338–339

  • Moore A, Baird L, Kaelbling LP (1999) Multi-value functions: efficient automatic action hierarchies for multiple goal MDPs. In: Proceedings of the international joint conference on artificial intelligence, Stockholm. Morgan Kaufmann, San Francisco, pp 1316–1323

  • Parr R, Russell SJ (1997) Reinforcement learning with hierarchies of machines. In: NIPS, Denver

  • Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York

  • Ryan MRK, Reid MD (2000) Using ILP to improve planning in hierarchical reinforcement learning. In: Proceedings of the tenth international conference on inductive logic programming (ILP 2000), London. Springer, London

  • Singh S (1992) Reinforcement learning with a hierarchy of abstract models. In: Proceedings of the tenth national conference on artificial intelligence, San Jose

  • Sutton RS, Precup D, Singh SP (1999) Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif Intell 112(1–2):181–211

  • Watkins CJCH (1989) Learning from delayed rewards. PhD thesis, King’s College, Cambridge

Copyright information

© 2017 Springer Science+Business Media New York

About this entry

Cite this entry

Hengst, B. (2017). Hierarchical Reinforcement Learning. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_363
