
IFSA: incremental feature-set augmentation for reinforcement learning tasks

Published: 14 May 2007

Abstract

Reinforcement learning is a popular and successful framework for many agent-related problems because only limited environmental feedback is necessary for learning. While many algorithms exist for learning effective policies in such problems, reinforcement learning is often applied to real-world problems, which typically have large state spaces and therefore suffer from the "curse of dimensionality." One effective method for speeding up reinforcement learning algorithms is to leverage expert knowledge. In this paper, we propose a method for dynamically augmenting the agent's feature set in order to speed up value-function-based reinforcement learning. A domain expert divides the feature set into a series of subsets such that a novel problem concept can be learned from each successive subset, and domain knowledge is also used to order the feature subsets by their importance for learning. Our algorithm uses the ordered feature subsets to learn tasks significantly faster than if the entire feature set is used from the start. Incremental Feature-Set Augmentation (IFSA) is fully implemented and tested in three different domains: Gridworld, Blackjack, and RoboCup Soccer Keepaway. All experiments show that IFSA can significantly speed up learning, demonstrating the applicability of this novel RL method.
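The abstract describes IFSA only at a high level. As a rough illustration of the idea, the Python sketch below wraps a value-function learner in a staged training loop whose feature set grows over expert-ordered subsets. The specific learner (a linear SARSA(0) agent), the fixed episode count per stage, the zero-initialization of weights for newly added features, and the `env`/`feature_subsets` interfaces are all assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np

# Illustrative IFSA-style loop (assumptions, not the paper's implementation):
# - linear SARSA(0) with one weight vector per action
# - stages switch after a fixed number of episodes
# - weights for newly added features start at zero, so earlier learning carries over
# - hypothetical env interface: env.reset() -> obs, env.step(a) -> (obs, reward, done)

class LinearSarsaAgent:
    def __init__(self, n_actions, n_features, alpha=0.1, gamma=0.99, epsilon=0.1):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.w = np.zeros((n_actions, n_features))  # weights over active features

    def q(self, features, action):
        return float(self.w[action] @ features)

    def act(self, features):
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.n_actions)
        return int(np.argmax([self.q(features, a) for a in range(self.n_actions)]))

    def update(self, f, a, reward, f_next, a_next, done):
        target = reward if done else reward + self.gamma * self.q(f_next, a_next)
        self.w[a] += self.alpha * (target - self.q(f, a)) * f

    def augment(self, n_new_features):
        # Grow the feature set: keep learned weights, start new ones at zero.
        self.w = np.hstack([self.w, np.zeros((self.n_actions, n_new_features))])


def ifsa_train(env, feature_subsets, episodes_per_stage, n_actions):
    """feature_subsets: expert-ordered list of feature extractors, most
    important first; each maps an observation to a numpy vector."""
    active = [feature_subsets[0]]
    agent = LinearSarsaAgent(n_actions, n_features=len(active[0](env.reset())))

    def phi(obs):
        return np.concatenate([f(obs) for f in active])

    for stage, extractor in enumerate(feature_subsets):
        if stage > 0:
            active.append(extractor)                      # add the next subset
            agent.augment(len(extractor(env.reset())))    # grow the weight vectors
        for _ in range(episodes_per_stage):
            obs, done = env.reset(), False
            f = phi(obs)
            a = agent.act(f)
            while not done:
                obs_next, reward, done = env.step(a)
                f_next = phi(obs_next)
                a_next = agent.act(f_next)
                agent.update(f, a, reward, f_next, a_next, done)
                f, a = f_next, a_next
    return agent
```

Initializing the new weights to zero is one simple way to let the value function learned on the earlier, more important subsets carry over unchanged when the feature set is augmented; the paper's actual transfer mechanism and stage-switching criterion may differ.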




Published In

cover image ACM Other conferences
AAMAS '07: Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
May 2007
1585 pages
ISBN:9788190426275
DOI:10.1145/1329125

Sponsors

  • IFAAMAS

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tag

  1. reinforcement learning

Qualifiers

  • Research-article


Conference

AAMAS '07
Sponsor: IFAAMAS

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%



Cited By

  • (2024) Fixing symbolic plans with reinforcement learning in object-based action spaces. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 12363-12369. DOI: 10.1109/IROS58592.2024.10801362. Online publication date: 14-Oct-2024.
  • (2020) Machine Learning. Fundamentals of Artificial Intelligence, pages 375-413. DOI: 10.1007/978-81-322-3972-7_13. Online publication date: 5-Apr-2020.
  • (2013) A Tutorial on Linear Function Approximators for Dynamic Programming and Reinforcement Learning. Foundations and Trends® in Machine Learning, 6(4):375-451. DOI: 10.1561/2200000042. Online publication date: 19-Dec-2013.
  • (2009) Recursive adaptation of stepsize parameter for non-stationary environments. Proceedings of the Second International Conference on Adaptive and Learning Agents, pages 74-90. DOI: 10.1007/978-3-642-11814-2_5. Online publication date: 12-May-2009.
  • (2009) Recursive Adaptation of Stepsize Parameter for Non-stationary Environments. Proceedings of the 12th International Conference on Principles of Practice in Multi-Agent Systems, pages 525-533. DOI: 10.1007/978-3-642-11161-7_38. Online publication date: 15-Dec-2009.
