Abstract
In this paper we survey the basics of reinforcement learning, generalization, and abstraction. We begin with an introduction to the fundamentals of reinforcement learning and motivate the need for generalization and abstraction. We then summarize the most important techniques for achieving both in reinforcement learning, discussing basic function approximation methods before delving into hierarchical, relational, and transfer learning. All concepts and techniques are illustrated with examples.