Abstract
This article describes a reinforcement learning approach to robot control and a new Modular Actor-Critic architecture that supports platform-independent control. The architecture is tested on a landmark-approaching task using movable pan/tilt cameras, and it successfully controls both a large PeopleBot and a small Sony Aibo robot on the navigation task with no retraining required. The architecture offers insight into skill transfer between different robotic platforms and into the modularisation that results from splitting the control task into its component parts. The architecture and its underlying principles could be used in the rapid prototyping of new robotic platforms, where an already functioning control system allows more sophisticated navigation to be developed.
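To make the Actor-Critic idea behind the architecture concrete, the following is a minimal tabular sketch, not the paper's modular architecture: a critic learns state values from the temporal-difference (TD) error, and the same error adjusts the actor's action preferences. The five-state corridor environment, learning rates, and discount factor are all illustrative assumptions.

```python
# Minimal tabular actor-critic sketch (illustrative assumptions throughout;
# this is NOT the modular architecture described in the article).
# Environment: a 5-state corridor; the agent starts at state 0 and
# receives reward 1 on reaching state 4 (terminal).
import numpy as np

rng = np.random.default_rng(0)
N_STATES = 5
ACTIONS = (-1, +1)                          # move left / move right
GAMMA, ALPHA_CRITIC, ALPHA_ACTOR = 0.9, 0.1, 0.1

V = np.zeros(N_STATES)                      # critic: state-value estimates
prefs = np.zeros((N_STATES, len(ACTIONS)))  # actor: action preferences

def policy(s):
    """Softmax over the actor's preferences for state s."""
    e = np.exp(prefs[s] - prefs[s].max())
    return e / e.sum()

for episode in range(500):
    s = 0
    while s != N_STATES - 1:
        a = rng.choice(len(ACTIONS), p=policy(s))
        s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        terminal = (s_next == N_STATES - 1)
        # The TD error drives both the critic and the actor updates.
        delta = r + GAMMA * V[s_next] * (not terminal) - V[s]
        V[s] += ALPHA_CRITIC * delta
        prefs[s, a] += ALPHA_ACTOR * delta
        s = s_next

# After training, the actor prefers moving right and the critic's
# values increase toward the goal state.
print(policy(0))
print(V)
```

The key design point this sketch illustrates is the separation of concerns: the critic evaluates states while the actor selects actions, and only the scalar TD error couples them, which is what makes the components modular and, in principle, swappable across platforms.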
Acknowledgements
Early stages of this study were supported partially by the MirrorBot and NestCom projects coordinated by Prof. Wermter. Thanks go to Kim Forster for her constant support and encouragement, to Dr. Kevin Burn for discussions, and to Chris Rowan, who assisted in setting up the robots and experiments.
Cite this article
Muse, D., Wermter, S. Actor-Critic Learning for Platform-Independent Robot Navigation. Cogn Comput 1, 203–220 (2009). https://doi.org/10.1007/s12559-009-9021-z