research-article
DOI: 10.1145/1553374.1553529

Discovering options from example trajectories

Published: 14 June 2009

Abstract

We present a novel technique for automated problem decomposition to address the problem of scalability in reinforcement learning. Our technique uses a set of near-optimal trajectories to discover options and incorporates them into the learning process, dramatically reducing the time it takes to solve the underlying problem. We run a series of experiments in two different domains and show that our method offers up to a 30-fold speedup over the baseline.
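The abstract does not spell out the discovery step. As a loose illustration of the general idea only (not the paper's actual algorithm), one could mine action subsequences that recur across the near-optimal example trajectories and promote the frequent ones to candidate options. All names and thresholds below are hypothetical.

```python
from collections import Counter

def discover_options(trajectories, min_len=2, max_len=4, min_support=2):
    """Count contiguous action subsequences across example trajectories and
    keep those occurring at least `min_support` times as candidate options.
    Illustrative sketch only; the paper's method is not reproduced here."""
    counts = Counter()
    for traj in trajectories:
        for n in range(min_len, max_len + 1):
            for i in range(len(traj) - n + 1):
                counts[tuple(traj[i:i + n])] += 1
    return [seq for seq, c in counts.items() if c >= min_support]

# Two near-optimal trajectories that share the subsequence ('up', 'up', 'right')
t1 = ['up', 'up', 'right', 'right']
t2 = ['left', 'up', 'up', 'right']
options = discover_options([t1, t2])
```

In this toy example the shared subsequence `('up', 'up', 'right')` recurs in both trajectories and is returned as a candidate option, which an agent could then invoke as a temporally extended action during learning.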




Published In

cover image ACM Other conferences
ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning
June 2009
1331 pages
ISBN:9781605585161
DOI:10.1145/1553374

Sponsors

  • NSF
  • Microsoft Research
  • MITACS

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%


Cited By

  • (2020) Analysis of Hierarchical Learning with Reinforcement for the Implementation of Behavioral Strategies of Intelligent Agents. Vestnik komp'iuternykh i informatsionnykh tekhnologii, 35-45. DOI: 10.14489/vkit.2020.09.pp.035-045. Published: Sep 2020.
  • (2020) Evaluating skills in hierarchical reinforcement learning. International Journal of Machine Learning and Cybernetics. DOI: 10.1007/s13042-020-01141-3. Published: 18 May 2020.
  • (2019) Automatic construction and evaluation of macro-actions in reinforcement learning. Applied Soft Computing, 105574. DOI: 10.1016/j.asoc.2019.105574. Published: Jun 2019.
  • (2018) Learning Options From Demonstrations: A Pac-Man Case Study. IEEE Transactions on Games, 10(1), 91-96. DOI: 10.1109/TCIAIG.2017.2658659. Published: Mar 2018.
  • (2012) Automatic task decomposition and state abstraction from demonstration. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Volume 1, 483-490. DOI: 10.5555/2343576.2343645. Published: 4 Jun 2012.
  • (2011) Robot learning from demonstration by constructing skill trees. The International Journal of Robotics Research, 31(3), 360-375. DOI: 10.1177/0278364911428653. Published: 5 Dec 2011.
  • (2011) Human-like action segmentation for option learning. 2011 RO-MAN, 455-460. DOI: 10.1109/ROMAN.2011.6005277. Published: Jul 2011.
  • (2011) Automatic construction of temporally extended actions for MDPs using bisimulation metrics. Proceedings of the 9th European Conference on Recent Advances in Reinforcement Learning, 140-152. DOI: 10.1007/978-3-642-29946-9_16. Published: 9 Sep 2011.
  • (2010) Improving reinforcement learning by using sequence trees. Machine Learning, 81(3), 283-331. DOI: 10.1007/s10994-010-5182-y. Published: 1 Dec 2010.
