Abstract
Reinforcement Learning (RL) is considered an appropriate paradigm for acquiring control policies in mobile robotics. However, in its standard (tabula rasa) formulation, RL must explore and learn everything from scratch, which is neither realistic nor effective in real-world tasks. In this article we use a new strategy, called Supervised Reinforcement Learning (SRL), that allows external knowledge to be incorporated into this type of learning. We validate it by learning a wall-following behaviour and testing it on a Nomad 200 robot. We show that SRL is able to take advantage of multiple sources of knowledge, and even of partially erroneous advice, features that allow an SRL agent to make use of a wide range of prior knowledge without the need for complex or time-consuming elaboration.
This work was supported by Xunta de Galicia’s project PGIDIT04TIC206011PR. David L. Moreno’s research was supported by MECD grant FPU-AP2001-3350.
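The abstract does not spell out the SRL algorithm itself, so the following sketch is only a hedged illustration of the general idea it describes: a tabular Q-learner whose action selection is biased by one or more external advisor functions. The class AdvisedQLearner, the advice-bonus scheme, the state and action names, and all parameter values are hypothetical and are not taken from the paper.

import random
from collections import defaultdict

# Hypothetical sketch of advice-biased tabular Q-learning; the paper's
# actual SRL formulation is not given in the abstract. Advice only biases
# action selection, while values are learned from environment reward, so
# partially erroneous advice can be overruled as learning progresses.
ACTIONS = ["turn_left", "go_straight", "turn_right"]

class AdvisedQLearner:
    def __init__(self, alpha=0.2, gamma=0.9, epsilon=0.1, advice_weight=1.0):
        self.q = defaultdict(float)         # (state, action) -> estimated value
        self.alpha = alpha                  # learning rate
        self.gamma = gamma                  # discount factor
        self.epsilon = epsilon              # exploration probability
        self.advice_weight = advice_weight  # bonus per advisor recommending an action

    def select_action(self, state, advisors):
        # Epsilon-greedy over Q-values plus a bonus for advised actions;
        # multiple advisors (multiple knowledge sources) simply add up.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        scores = {a: self.q[(state, a)]
                     + self.advice_weight * sum(adv(state) == a for adv in advisors)
                  for a in ACTIONS}
        return max(scores, key=scores.get)

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update, driven by environment reward alone.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error

# Example use with a crude, possibly wrong, hand-coded advisor.
def rough_advisor(state):
    return "go_straight" if state == "wall_at_side" else "turn_right"

agent = AdvisedQLearner()
a = agent.select_action("wall_at_side", advisors=[rough_advisor])
agent.update("wall_at_side", a, reward=1.0, next_state="wall_at_side")

Because the advice enters only as a selection bias while the value estimates are driven by reward, a consistently wrong advisor is eventually outvoted by the learned Q-values; this is one simple way an agent can benefit from unelaborated or partially erroneous advice of the kind the abstract describes.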
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Moreno, D.L., Regueiro, C.V., Iglesias, R., Barro, S. (2005). Making Use of Unelaborated Advice to Improve Reinforcement Learning: A Mobile Robotics Approach. In: Singh, S., Singh, M., Apte, C., Perner, P. (eds.) Pattern Recognition and Data Mining. ICAPR 2005. Lecture Notes in Computer Science, vol. 3686. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551188_10
DOI: https://doi.org/10.1007/11551188_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28757-5
Online ISBN: 978-3-540-28758-2
eBook Packages: Computer Science (R0)