
Batch Reinforcement Learning for Robotic Soccer Using the Q-Batch Update-Rule

Published in: Journal of Intelligent & Robotic Systems

Abstract

Reinforcement Learning is becoming an increasingly valuable approach to tackling many of the challenges posed by a semi-structured, non-deterministic and adversarial environment such as robotic soccer. Batch Reinforcement Learning is a class of Reinforcement Learning methods characterized by processing a batch of interactions. By storing all past interactions, Batch RL methods are extremely data-efficient, which makes this class of methods very appealing for robotics applications, especially when learning directly on physical robotic platforms. This paper presents the application of Batch Reinforcement Learning to obtain efficient robotic soccer controllers on physical platforms. To learn the controllers, we propose the application of Q-Batch, a novel update rule that exploits the episodic nature of the interactions in Batch Reinforcement Learning. The approach was validated in three tasks of increasing difficulty. Results show that the proposed approach outperforms hand-coded policies in all tasks within a reduced amount of interaction time. Additionally, for one of the tasks, a comparison between Q-Batch and Q-learning is carried out, and the results show that Q-Batch obtains better policies than Q-learning for the same amount of interaction time.
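The distinction the abstract draws between online and batch updates can be made concrete with a small sketch. The code below is a minimal, illustrative tabular example of an episode-wise batch update; it is not a reproduction of the paper's Q-Batch rule, and the discount factor, learning rate, action set, and episode format are assumptions made purely for the example. It only shows how replaying stored episodes backwards lets a terminal reward propagate along the whole trajectory in a single pass over the batch, which is one simple way an update rule can exploit the episodic structure of the stored interactions.

```python
from collections import defaultdict

# Illustrative sketch only (this is NOT the paper's exact Q-Batch rule):
# a tabular Q update applied to a batch of stored episodes, replaying each
# episode backwards so that a reward observed at the end of a trajectory
# propagates through the earlier state-action pairs within a single pass.

GAMMA = 0.99         # discount factor (assumed value for this example)
ALPHA = 0.5          # learning rate (assumed value for this example)
ACTIONS = (0, 1, 2)  # hypothetical discrete action set

def batch_episodic_update(Q, episodes, gamma=GAMMA, alpha=ALPHA):
    """Update Q in place from a batch of episodes.

    Each episode is a list of (state, action, reward, next_state, done)
    tuples in the order they were experienced on the robot.
    """
    for episode in episodes:
        # Backward replay: by the time an earlier transition is updated,
        # the bootstrap value of its next_state already reflects the
        # updates made later in the same episode.
        for state, action, reward, next_state, done in reversed(episode):
            target = reward
            if not done:
                target += gamma * max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q

# Toy usage: a two-step episode whose only reward arrives at the end.
Q = defaultdict(float)
episode = [("s0", 0, 0.0, "s1", False),
           ("s1", 1, 1.0, "s2", True)]
batch_episodic_update(Q, [episode])
print(Q[("s0", 0)], Q[("s1", 1)])  # both entries are non-zero after one pass
```

For comparison, standard online Q-learning applies the same kind of update but only to the single most recent transition, which is why a batch method that reuses every stored interaction can be far more economical with robot time, as the abstract argues.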




Author information

Corresponding author

Correspondence to João Cunha.

About this article


Cite this article

Cunha, J., Serra, R., Lau, N. et al. Batch Reinforcement Learning for Robotic Soccer Using the Q-Batch Update-Rule. J Intell Robot Syst 80, 385–399 (2015). https://doi.org/10.1007/s10846-014-0171-1

