Abstract
This work is concerned with practical issues surrounding the application of reinforcement learning to a mobile robot. The robot’s task is to navigate in a controlled environment and to collect objects using its gripper. Our aim is to build a control system that enables the robot to learn incrementally and to adapt to changes in the environment. The former is known as multi-task learning; the latter is usually referred to as continual (‘lifelong’) learning. First, we emphasize the connection between adaptive state-space quantisation and continual learning. Second, we describe a novel method for multi-task learning in reinforcement environments. This method is based on constructive neural networks and uses instance-based learning and dynamic programming to compute a task-dependent agent-internal state space. Third, we describe how the learning system is integrated with the control architecture of the robot. Finally, we investigate the capabilities of the learning algorithm with respect to the transfer of information between related reinforcement learning tasks, such as navigation tasks in different environments. It is hoped that this method will lead to a speed-up in reinforcement learning and enable an autonomous robot to adapt its behaviour as the environment changes.
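To illustrate the two ingredients the abstract names, a minimal sketch follows. It is not the authors' method: the corridor task, the prototype positions, and all parameter values are invented for illustration. It shows the general idea of mapping a continuous sensor reading to the nearest stored instance (an agent-internal discrete state) and then running dynamic programming (value iteration) over that quantised state space.

```python
# Hedged sketch, not the paper's implementation: a 1-D corridor where
# continuous positions are quantised to the nearest stored prototype
# (instance-based quantisation), and value iteration (dynamic
# programming) is run over the resulting discrete state space.

prototypes = [0.1, 0.3, 0.5, 0.7, 0.9]   # stored instances along the corridor
ACTIONS = (-0.2, +0.2)                    # 0 = move left, 1 = move right
GOAL = 4                                  # index of the goal prototype (x = 0.9)
GAMMA = 0.9

def quantise(x):
    """Map a continuous position to the index of the nearest prototype."""
    return min(range(len(prototypes)), key=lambda i: abs(x - prototypes[i]))

def transition(s, a):
    """Deterministic move in the corridor, clamped to [0, 1], then quantised."""
    x = min(max(prototypes[s] + ACTIONS[a], 0.0), 1.0)
    return quantise(x)

# Value iteration over the quantised states; the goal state is terminal,
# and a reward of 1 is received on any transition into it.
V = [0.0] * len(prototypes)
for _ in range(100):
    for s in range(len(prototypes)):
        if s == GOAL:
            continue
        V[s] = max(
            (1.0 if transition(s, a) == GOAL else 0.0) + GAMMA * V[transition(s, a)]
            for a in range(len(ACTIONS))
        )

# Greedy policy: moving right (action 1) is optimal in every non-goal state.
policy = [
    max(range(len(ACTIONS)),
        key=lambda a: (1.0 if transition(s, a) == GOAL else 0.0)
                      + GAMMA * V[transition(s, a)])
    for s in range(len(prototypes))
]
print(V, policy)
```

In the paper's setting the prototypes would be grown constructively as new situations are encountered, rather than fixed in advance as they are here; the sketch only shows why a discrete internal state space makes dynamic-programming backups applicable to continuous sensor input.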
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Großmann, A., Poli, R. (2000). Learning a Navigation Task in Changing Environments by Multi-task Reinforcement Learning. In: Wyatt, J., Demiris, J. (eds) Advances in Robot Learning. EWLR 1999. Lecture Notes in Computer Science, vol 1812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40044-3_2
Print ISBN: 978-3-540-41162-8
Online ISBN: 978-3-540-40044-8