Abstract
This work is concerned with practical issues surrounding the application of reinforcement learning to a mobile robot. The robot’s task is to navigate in a controlled environment and to collect objects using its gripper. Our aim is to build a control system that enables the robot to learn incrementally and to adapt to changes in the environment. The former is known as multi-task learning; the latter is usually referred to as continual (‘lifelong’) learning. First, we emphasize the connection between adaptive state-space quantisation and continual learning. Second, we describe a novel method for multi-task learning in reinforcement environments. This method is based on constructive neural networks and uses instance-based learning and dynamic programming to compute a task-dependent agent-internal state space. Third, we describe how the learning system is integrated with the control architecture of the robot. Finally, we investigate the capabilities of the learning algorithm with respect to the transfer of information between related reinforcement learning tasks, such as navigation tasks in different environments. It is hoped that this method will lead to a speed-up in reinforcement learning and enable an autonomous robot to adapt its behaviour as the environment changes.
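To illustrate the two ingredients the abstract names, a minimal sketch follows. It is not the authors' method: the corridor task, the prototype positions, and all parameter values are invented for illustration. It shows the general idea of mapping a continuous sensor reading to the nearest stored instance (an agent-internal discrete state) and then running dynamic programming (value iteration) over that quantised state space.

```python
# Hedged sketch, not the paper's implementation: a 1-D corridor where
# continuous positions are quantised to the nearest stored prototype
# (instance-based quantisation), and value iteration (dynamic
# programming) is run over the resulting discrete state space.

prototypes = [0.1, 0.3, 0.5, 0.7, 0.9]   # stored instances along the corridor
ACTIONS = (-0.2, +0.2)                    # 0 = move left, 1 = move right
GOAL = 4                                  # index of the goal prototype (x = 0.9)
GAMMA = 0.9

def quantise(x):
    """Map a continuous position to the index of the nearest prototype."""
    return min(range(len(prototypes)), key=lambda i: abs(x - prototypes[i]))

def transition(s, a):
    """Deterministic move in the corridor, clamped to [0, 1], then quantised."""
    x = min(max(prototypes[s] + ACTIONS[a], 0.0), 1.0)
    return quantise(x)

# Value iteration over the quantised states; the goal state is terminal,
# and a reward of 1 is received on any transition into it.
V = [0.0] * len(prototypes)
for _ in range(100):
    for s in range(len(prototypes)):
        if s == GOAL:
            continue
        V[s] = max(
            (1.0 if transition(s, a) == GOAL else 0.0) + GAMMA * V[transition(s, a)]
            for a in range(len(ACTIONS))
        )

# Greedy policy: moving right (action 1) is optimal in every non-goal state.
policy = [
    max(range(len(ACTIONS)),
        key=lambda a: (1.0 if transition(s, a) == GOAL else 0.0)
                      + GAMMA * V[transition(s, a)])
    for s in range(len(prototypes))
]
print(V, policy)
```

In the paper's setting the prototypes would be grown constructively as new situations are encountered, rather than fixed in advance as they are here; the sketch only shows why a discrete internal state space makes dynamic-programming backups applicable to continuous sensor input.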
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Großmann, A., Poli, R. (2000). Learning a Navigation Task in Changing Environments by Multi-task Reinforcement Learning. In: Wyatt, J., Demiris, J. (eds) Advances in Robot Learning. EWLR 1999. Lecture Notes in Computer Science, vol 1812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40044-3_2
Print ISBN: 978-3-540-41162-8
Online ISBN: 978-3-540-40044-8