research-article
DOI: 10.1145/1553374.1553529

Discovering options from example trajectories

Published: 14 June 2009

Abstract

We present a novel technique for automated problem decomposition to address the problem of scalability in reinforcement learning. Our technique uses a set of near-optimal trajectories to discover options and incorporates them into the learning process, dramatically reducing the time it takes to solve the underlying problem. We run a series of experiments in two different domains and show that our method offers up to a 30-fold speedup over the baseline.
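The abstract does not spell out the discovery step. As a loose illustration of the general idea only (not the paper's actual algorithm), one could mine action subsequences that recur across the near-optimal example trajectories and promote the frequent ones to candidate options. All names and thresholds below are hypothetical.

```python
from collections import Counter

def discover_options(trajectories, min_len=2, max_len=4, min_support=2):
    """Count contiguous action subsequences across example trajectories and
    keep those occurring at least `min_support` times as candidate options.
    Illustrative sketch only; the paper's method is not reproduced here."""
    counts = Counter()
    for traj in trajectories:
        for n in range(min_len, max_len + 1):
            for i in range(len(traj) - n + 1):
                counts[tuple(traj[i:i + n])] += 1
    return [seq for seq, c in counts.items() if c >= min_support]

# Two near-optimal trajectories that share the subsequence ('up', 'up', 'right')
t1 = ['up', 'up', 'right', 'right']
t2 = ['left', 'up', 'up', 'right']
options = discover_options([t1, t2])
```

In this toy example the shared subsequence `('up', 'up', 'right')` recurs in both trajectories and is returned as a candidate option, which an agent could then invoke as a temporally extended action during learning.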




Published In

cover image ACM Other conferences
ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009
1331 pages
ISBN:9781605585161
DOI:10.1145/1553374

Sponsors

  • NSF
  • Microsoft Research
  • MITACS

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%


Cited By

  • (2020) Analysis of Hierarchical Learning with Reinforcement for the Implementation of Behavioral Strategies of Intelligent Agents. Vestnik komp'iuternykh i informatsionnykh tekhnologii, 35-45. DOI: 10.14489/vkit.2020.09.pp.035-045. Published: Sep 2020.
  • (2020) Evaluating skills in hierarchical reinforcement learning. International Journal of Machine Learning and Cybernetics. DOI: 10.1007/s13042-020-01141-3. Published: 18 May 2020.
  • (2019) Automatic construction and evaluation of macro-actions in reinforcement learning. Applied Soft Computing, 105574. DOI: 10.1016/j.asoc.2019.105574. Published: Jun 2019.
  • (2018) Learning Options From Demonstrations: A Pac-Man Case Study. IEEE Transactions on Games, 10(1), 91-96. DOI: 10.1109/TCIAIG.2017.2658659. Published: Mar 2018.
  • (2012) Automatic task decomposition and state abstraction from demonstration. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Volume 1, 483-490. DOI: 10.5555/2343576.2343645. Published: 4 Jun 2012.
  • (2011) Robot learning from demonstration by constructing skill trees. The International Journal of Robotics Research, 31(3), 360-375. DOI: 10.1177/0278364911428653. Published: 5 Dec 2011.
  • (2011) Human-like action segmentation for option learning. 2011 RO-MAN, 455-460. DOI: 10.1109/ROMAN.2011.6005277. Published: Jul 2011.
  • (2011) Automatic construction of temporally extended actions for MDPs using bisimulation metrics. Proceedings of the 9th European Conference on Recent Advances in Reinforcement Learning, 140-152. DOI: 10.1007/978-3-642-29946-9_16. Published: 9 Sep 2011.
  • (2010) Improving reinforcement learning by using sequence trees. Machine Learning, 81(3), 283-331. DOI: 10.1007/s10994-010-5182-y. Published: 1 Dec 2010.
