Abstract
Hierarchical algorithms for Markov decision processes have proved useful in problem domains with multiple subtasks. Although existing hierarchical approaches are strong in task decomposition, they are weak in task abstraction, which is more important for task analysis and modeling. In this paper, we propose a task-oriented design that strengthens task abstraction. Our approach learns an episodic task model from the problem domain, with which the planner achieves the same control effect as the original model, but with a more concise structure and substantially improved performance. According to our analysis and experimental evaluation, our approach outperforms existing hierarchical algorithms such as MAXQ and HEXQ.
References
Boutilier C, Dearden R, Goldszmidt M (1995) Exploiting structure in policy construction. In: Proceedings of IJCAI, pp 1104–1113
Deák F, Kovács A, Váncza J, Dobrowiecki TP (2001) Hierarchical knowledge-based process planning in manufacturing. In: Proceedings of the IFIP 11 international PROLAMAT conference on digital enterprise, pp 428–439
Dietterich TG (1998) The MAXQ method for hierarchical reinforcement learning. In: ICML, San Francisco, CA, USA, pp 118–126
Dietterich TG (2000) Hierarchical reinforcement learning with the MAXQ value function decomposition. J Artif Intell Res 13: 227–303
Hansen EA, Zhou R (2003) Synthesis of hierarchical finite-state controllers for POMDPs. In: Proceedings of ICAPS, AAAI, pp 113–122
Hengst B (2002) Discovering hierarchy in reinforcement learning with HEXQ. In: ICML ’02: proceedings of the nineteenth international conference on machine learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 243–250
Jonsson A, Barto A (2005) A causal approach to hierarchical decomposition of factored MDPs. In: ICML ’05: proceedings of the 22nd international conference on machine learning, ACM, New York, NY, USA, pp 401–408, http://doi.acm.org/10.1145/1102351.1102402
Pineau J, Roy N, Thrun S (2001) A hierarchical approach to POMDP planning and execution. In: Workshop on hierarchy and memory in reinforcement learning (ICML)
Potts D, Hengst B (2004) Discovering multiple levels of a task hierarchy concurrently. Rob Auton Syst 49(1-2): 43–55
Smith T, Simmons RG (2004) Heuristic search value iteration for POMDPs. In: Proceedings of UAI
Sutton RS, Precup D, Singh S (1999) Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artif Intell 112(1-2):181–211. http://dx.doi.org/10.1016/S0004-3702(99)00052-1
Cite this article
Lin, Y., Makedon, F. & Xu, Y. Episodic task learning in Markov decision processes. Artif Intell Rev 36, 87–98 (2011). https://doi.org/10.1007/s10462-011-9204-3