Abstract
The application of Reinforcement Learning (RL) algorithms to robot tasks is often limited by the large dimension of the state space, which can make a tabular representation prohibitively expensive. In this paper, we describe LEAP (Learning Entities Adaptive Partitioning), a model-free learning algorithm that uses overlapping partitions, dynamically modified during learning, to obtain near-optimal policies with a small number of parameters. Starting from a coarse aggregation of the state space, LEAP generates refined partitions whenever it detects an incoherence between the current action values and the actual rewards from the environment. Since in highly stochastic problems this adaptive process can lead to over-refinement, we introduce a mechanism that prunes macrostates without affecting the learned policy. Through refinement and pruning, LEAP builds a multi-resolution state representation that is specialized only where it is actually needed. In the last section, we present an experimental evaluation on a grid world and on a complex simulated robotic-soccer task.
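The refine-on-incoherence idea in the abstract can be illustrated with a minimal sketch: Q-learning over a coarse 1-D partition that splits a macrostate whenever the observed TD error is large, so resolution grows only where the value function demands it. This is not LEAP itself (no overlapping partitions, no pruning); the class name, the split rule based on a single TD-error threshold, and all constants below are illustrative assumptions.

```python
import random

class AdaptivePartitionQ:
    """Q-learning over an adaptively refined 1-D partition (a simplified
    sketch of refinement-on-incoherence; not the paper's LEAP algorithm)."""

    def __init__(self, n_states, n_actions, alpha=0.2, gamma=0.95,
                 split_threshold=0.3):
        self.n_actions = n_actions
        self.bounds = [0, n_states]     # one macrostate covers everything
        self.q = [[0.0] * n_actions]    # one Q-row per macrostate
        self.alpha, self.gamma = alpha, gamma
        self.split_threshold = split_threshold

    def macro(self, s):
        """Index of the macrostate containing primitive state s."""
        for i in range(len(self.bounds) - 1):
            if self.bounds[i] <= s < self.bounds[i + 1]:
                return i
        return len(self.bounds) - 2

    def update(self, s, a, r, s2, done):
        i = self.macro(s)
        target = r if done else r + self.gamma * max(self.q[self.macro(s2)])
        td = target - self.q[i][a]
        self.q[i][a] += self.alpha * td
        # A large TD error signals an "incoherent" macrostate: the
        # aggregated value cannot explain the observed return, so refine.
        if abs(td) > self.split_threshold and \
           self.bounds[i + 1] - self.bounds[i] > 1:
            self.split(i)

    def split(self, i):
        mid = (self.bounds[i] + self.bounds[i + 1]) // 2
        self.bounds.insert(i + 1, mid)
        self.q.insert(i + 1, list(self.q[i]))  # child inherits parent's values

# Toy corridor: 8 states, actions 0=left/1=right, reward 1 at state 7.
random.seed(0)
learner = AdaptivePartitionQ(n_states=8, n_actions=2)
for episode in range(500):
    s = 0
    for step in range(200):
        i = learner.macro(s)
        best = max(learner.q[i])
        a = random.randrange(2) if random.random() < 0.1 else \
            random.choice([k for k in range(2) if learner.q[i][k] == best])
        s2 = max(0, s - 1) if a == 0 else min(7, s + 1)
        done = s2 == 7
        learner.update(s, a, 1.0 if done else 0.0, s2, done)
        s = s2
        if done:
            break
print(learner.bounds)  # partition ends up finer near the rewarding state
```

The resulting partition stays coarse far from the reward and becomes fine near it, giving the multi-resolution representation the abstract describes; LEAP additionally prunes macrostates when stochastic rewards cause spurious splits.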
© 2007 Springer-Verlag Berlin Heidelberg
Bonarini, A., Lazaric, A., Restelli, M. (2007). Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions. In: Basili, R., Pazienza, M.T. (eds) AI*IA 2007: Artificial Intelligence and Human-Oriented Computing. AI*IA 2007. Lecture Notes in Computer Science(), vol 4733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74782-6_46
Print ISBN: 978-3-540-74781-9
Online ISBN: 978-3-540-74782-6