Abstract
This paper presents a system that transfers the results of prior learning to speed up reinforcement learning in a changing world. Often, even when the change to the world is relatively small, an extensive relearning effort is required. The new system exploits strong features in the multi-dimensional function produced by reinforcement learning. These features induce a partitioning of the state space, which is represented as a graph. The graph is used to index and compose functions stored in a case base, forming a close approximation to the solution of the new task. The experiments investigate one important example of a changing world: a new goal position. In this situation, the system achieves close to a two orders of magnitude increase in learning rate over a basic reinforcement learning algorithm.
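To make the setting concrete, the sketch below shows the general idea of reusing a previously learned value function when only the goal position changes. It is not the paper's case-base composition algorithm; it is a minimal tabular Q-learning illustration in which a stored Q-table from an earlier task seeds learning for the new goal, and all names, sizes, and parameters are illustrative assumptions.

    import random

    # Minimal sketch (not the paper's algorithm): tabular Q-learning on a small
    # grid world, where a Q-table learned for an old goal is reused to
    # initialise learning for a new goal. All values are illustrative.

    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    SIZE = 8

    def step(state, action, goal):
        # Deterministic grid dynamics; reward 1 at the goal, 0 elsewhere.
        r, c = state
        dr, dc = action
        nr = min(max(r + dr, 0), SIZE - 1)
        nc = min(max(c + dc, 0), SIZE - 1)
        next_state = (nr, nc)
        reward = 1.0 if next_state == goal else 0.0
        return next_state, reward, next_state == goal

    def q_learning(goal, episodes, q=None, alpha=0.1, gamma=0.95, eps=0.1):
        # Standard Q-learning; the optional q argument lets a stored "case"
        # seed the value function instead of starting from zero.
        q = dict(q) if q else {}
        for _ in range(episodes):
            state, done = (0, 0), False
            while not done:
                if random.random() < eps:
                    a = random.randrange(len(ACTIONS))
                else:
                    a = max(range(len(ACTIONS)),
                            key=lambda i: q.get((state, i), 0.0))
                nxt, reward, done = step(state, ACTIONS[a], goal)
                best_next = max(q.get((nxt, i), 0.0)
                                for i in range(len(ACTIONS)))
                old = q.get((state, a), 0.0)
                q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
                state = nxt
        return q

    # Learn the original task and store the result as a case, then reuse it as
    # the starting point when the goal moves; far fewer episodes are needed
    # than when learning the new task from scratch.
    case = q_learning(goal=(7, 7), episodes=500)
    q_new = q_learning(goal=(7, 0), episodes=50, q=case)

The paper's contribution goes beyond this kind of warm start: it decomposes the learned function along strong features into pieces indexed by a graph, so that stored pieces can be recombined for the new goal rather than simply copied.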
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Drummond, C. (1998). Composing functions to speed up reinforcement learning in a changing world. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026708
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64417-0
Online ISBN: 978-3-540-69781-7