
IFSA: incremental feature-set augmentation for reinforcement learning tasks

Published: 14 May 2007

Abstract

Reinforcement learning is a popular and successful framework for many agent-related problems because only limited environmental feedback is necessary for learning. While many algorithms exist for learning effective policies in such problems, reinforcement learning is often applied to real-world problems, which typically have large state spaces and therefore suffer from the "curse of dimensionality." One effective method for speeding up reinforcement learning algorithms is to leverage expert knowledge. In this paper, we propose a method for dynamically augmenting the agent's feature set in order to speed up value-function-based reinforcement learning. A domain expert divides the feature set into a series of subsets such that a novel problem concept can be learned from each successive subset, and domain knowledge is also used to order the feature subsets by their importance for learning. Our algorithm uses the ordered feature subsets to learn tasks significantly faster than if the entire feature set is used from the start. Incremental Feature-Set Augmentation (IFSA) is fully implemented and tested in three different domains: Gridworld, Blackjack, and RoboCup Soccer Keepaway. All experiments show that IFSA can significantly speed up learning, demonstrating the applicability of this novel RL method.
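The abstract describes IFSA only at a high level. As a rough illustration of the idea, the Python sketch below wraps a value-function learner in a staged training loop whose feature set grows over expert-ordered subsets. The specific learner (a linear SARSA(0) agent), the fixed episode count per stage, the zero-initialization of weights for newly added features, and the `env`/`feature_subsets` interfaces are all assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np

# Illustrative IFSA-style loop (assumptions, not the paper's implementation):
# - linear SARSA(0) with one weight vector per action
# - stages switch after a fixed number of episodes
# - weights for newly added features start at zero, so earlier learning carries over
# - hypothetical env interface: env.reset() -> obs, env.step(a) -> (obs, reward, done)

class LinearSarsaAgent:
    def __init__(self, n_actions, n_features, alpha=0.1, gamma=0.99, epsilon=0.1):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.w = np.zeros((n_actions, n_features))  # weights over active features

    def q(self, features, action):
        return float(self.w[action] @ features)

    def act(self, features):
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        return int(np.argmax([self.q(features, a) for a in range(self.n_actions)]))

    def update(self, f, a, reward, f_next, a_next, done):
        target = reward if done else reward + self.gamma * self.q(f_next, a_next)
        self.w[a] += self.alpha * (target - self.q(f, a)) * f

    def augment(self, n_new_features):
        # Grow the feature set: keep learned weights, start new ones at zero.
        self.w = np.hstack([self.w, np.zeros((self.n_actions, n_new_features))])


def ifsa_train(env, feature_subsets, episodes_per_stage, n_actions):
    """feature_subsets: expert-ordered list of feature extractors, most
    important first; each maps an observation to a numpy vector."""
    active = [feature_subsets[0]]
    agent = LinearSarsaAgent(n_actions, n_features=len(active[0](env.reset())))

    def phi(obs):
        return np.concatenate([f(obs) for f in active])

    for stage, extractor in enumerate(feature_subsets):
        if stage > 0:
            active.append(extractor)                      # add the next subset
            agent.augment(len(extractor(env.reset())))    # grow the weight vectors
        for _ in range(episodes_per_stage):
            obs, done = env.reset(), False
            f = phi(obs)
            a = agent.act(f)
            while not done:
                obs_next, reward, done = env.step(a)
                f_next = phi(obs_next)
                a_next = agent.act(f_next)
                agent.update(f, a, reward, f_next, a_next, done)
                f, a = f_next, a_next
    return agent
```

Initializing the new weights to zero is one simple way to let the value function learned on the earlier, more important subsets carry over unchanged when the feature set is augmented; the paper's actual transfer mechanism and stage-switching criterion may differ.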




Published In

cover image ACM Other conferences
AAMAS '07: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
May 2007
1585 pages
ISBN:9788190426275
DOI:10.1145/1329125

Sponsors

  • IFAAMAS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tag

  1. reinforcement learning

Qualifiers

  • Research-article


Conference

AAMAS '07
Sponsor: IFAAMAS

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%



Cited By

  • (2024) Fixing symbolic plans with reinforcement learning in object-based action spaces. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 12363-12369. DOI: 10.1109/IROS58592.2024.10801362. Online publication date: 14-Oct-2024.
  • (2020) Machine Learning. Fundamentals of Artificial Intelligence, pages 375-413. DOI: 10.1007/978-81-322-3972-7_13. Online publication date: 5-Apr-2020.
  • (2013) A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning. Foundations and Trends® in Machine Learning, 6(4):375-451. DOI: 10.1561/2200000042. Online publication date: 19-Dec-2013.
  • (2009) Recursive adaptation of stepsize parameter for non-stationary environments. Proceedings of the Second International Conference on Adaptive and Learning Agents, pages 74-90. DOI: 10.1007/978-3-642-11814-2_5. Online publication date: 12-May-2009.
  • (2009) Recursive Adaptation of Stepsize Parameter for Non-stationary Environments. Proceedings of the 12th International Conference on Principles of Practice in Multi-Agent Systems, pages 525-533. DOI: 10.1007/978-3-642-11161-7_38. Online publication date: 15-Dec-2009.
