
Batch Reinforcement Learning for Robotic Soccer Using the Q-Batch Update-Rule

Published in: Journal of Intelligent & Robotic Systems

Abstract

Reinforcement Learning is becoming an increasingly valuable approach to tackling many of the challenges posed by a semi-structured, non-deterministic and adversarial environment such as robotic soccer. Batch Reinforcement Learning is a class of Reinforcement Learning methods characterized by processing a batch of interactions. By storing all past interactions, Batch RL methods are extremely data-efficient, which makes this class of methods very appealing for robotics applications, especially when learning directly on physical robotic platforms. This paper presents the application of Batch Reinforcement Learning to obtain efficient robotic soccer controllers on physical platforms. To learn the controllers, we propose the application of Q-Batch, a novel update rule that exploits the episodic nature of the interactions in Batch Reinforcement Learning. The approach was validated in three tasks of increasing difficulty. Results show that the proposed approach outperforms hand-coded policies in all tasks within a reduced amount of interaction time. Additionally, for one of the tasks, a comparison between Q-Batch and Q-learning is carried out, and the results show that Q-Batch obtains better policies than Q-learning for the same amount of interaction time.
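The distinction the abstract draws between online and batch updates can be made concrete with a small sketch. The code below is a minimal, illustrative tabular example of an episode-wise batch update; it is not a reproduction of the paper's Q-Batch rule, and the discount factor, learning rate, action set, and episode format are assumptions made purely for the example. It only shows how replaying stored episodes backwards lets a terminal reward propagate along the whole trajectory in a single pass over the batch, which is one simple way an update rule can exploit the episodic structure of the stored interactions.

```python
from collections import defaultdict

# Illustrative sketch only (this is NOT the paper's exact Q-Batch rule):
# a tabular Q update applied to a batch of stored episodes, replaying each
# episode backwards so that a reward observed at the end of a trajectory
# propagates through the earlier state-action pairs within a single pass.

GAMMA = 0.99         # discount factor (assumed value for this example)
ALPHA = 0.5          # learning rate (assumed value for this example)
ACTIONS = (0, 1, 2)  # hypothetical discrete action set

def batch_episodic_update(Q, episodes, gamma=GAMMA, alpha=ALPHA):
    """Update Q in place from a batch of episodes.

    Each episode is a list of (state, action, reward, next_state, done)
    tuples in the order they were experienced on the robot.
    """
    for episode in episodes:
        # Backward replay: by the time an earlier transition is updated,
        # the bootstrap value of its next_state already reflects the
        # updates made later in the same episode.
        for state, action, reward, next_state, done in reversed(episode):
            target = reward
            if not done:
                target += gamma * max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q

# Toy usage: a two-step episode whose only reward arrives at the end.
Q = defaultdict(float)
episode = [("s0", 0, 0.0, "s1", False),
           ("s1", 1, 1.0, "s2", True)]
batch_episodic_update(Q, [episode])
print(Q[("s0", 0)], Q[("s1", 1)])  # both entries are non-zero after one pass
```

For comparison, standard online Q-learning applies the same kind of update but only to the single most recent transition, which is why a batch method that reuses every stored interaction can be far more economical with robot time, as the abstract argues.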




Author information

Corresponding author

Correspondence to João Cunha.

About this article


Cite this article

Cunha, J., Serra, R., Lau, N. et al. Batch Reinforcement Learning for Robotic Soccer Using the Q-Batch Update-Rule. J Intell Robot Syst 80, 385–399 (2015). https://doi.org/10.1007/s10846-014-0171-1

