Abstract
Batch reinforcement learning methods provide a powerful framework for learning efficiently and effectively in autonomous robots. This paper reviews recent work by the authors on applying reinforcement learning successfully in a challenging and complex domain. It discusses several variants of the general batch learning framework, tailored in particular to the use of multilayer perceptrons to approximate value functions over continuous state spaces. The batch learning framework is used to learn crucial skills in our soccer-playing robots participating in the RoboCup competitions. We demonstrate this on three case studies.
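The core idea the abstract describes, regressing a multilayer perceptron onto bootstrapped Q-value targets computed from a fixed batch of transitions, can be illustrated with a minimal sketch. Everything below (the toy chain task, the tiny one-hidden-layer network, the feature encoding, learning rates, and iteration counts) is an illustrative assumption, not the authors' actual implementation:

```python
import numpy as np

# Minimal sketch of batch (fitted) Q iteration with an MLP value function.
# A fixed batch of (s, a, r, s') transitions is collected once; the network
# is then repeatedly regressed onto bootstrapped targets r + gamma * max Q.
rng = np.random.default_rng(0)

N_STATES, ACTIONS, GAMMA = 5, (-1, +1), 0.9  # toy 1-D chain, goal on the right

def features(s, a):
    # Hand-crafted state-action encoding: normalized state, action flag, bias.
    return np.array([s / (N_STATES - 1), float(a == +1), 1.0])

class MLP:
    """One-hidden-layer perceptron trained by plain stochastic gradient descent."""
    def __init__(self, n_in, n_hidden=10, lr=0.05):
        self.W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
        self.w2 = rng.normal(scale=0.5, size=n_hidden)
        self.lr = lr

    def forward(self, x):
        h = np.tanh(self.W1 @ x)
        return h, float(self.w2 @ h)

    def predict(self, x):
        return self.forward(x)[1]

    def train(self, X, y, epochs=200):
        for _ in range(epochs):
            for x, t in zip(X, y):
                h, out = self.forward(x)
                err = out - t                      # squared-error gradient
                self.w2 -= self.lr * err * h
                self.W1 -= self.lr * np.outer(err * self.w2 * (1 - h**2), x)

# Collect a batch of transitions: moving into the rightmost state pays 1.
batch = []
for _ in range(200):
    s = int(rng.integers(0, N_STATES - 1))
    a = ACTIONS[rng.integers(2)]
    s2 = int(np.clip(s + a, 0, N_STATES - 1))
    batch.append((s, a, 1.0 if s2 == N_STATES - 1 else 0.0, s2))

# Fitted Q iteration: recompute targets, then refit the network on the batch.
q = MLP(n_in=3)
for _ in range(20):
    X = [features(s, a) for s, a, _, _ in batch]
    y = [r if s2 == N_STATES - 1                  # treat goal as terminal
         else r + GAMMA * max(q.predict(features(s2, b)) for b in ACTIONS)
         for _, _, r, s2 in batch]
    q.train(X, np.array(y))

# Greedy policy extracted from the learned Q function.
print([max(ACTIONS, key=lambda a: q.predict(features(s, a)))
       for s in range(N_STATES - 1)])
```

The batch setting matters for robots: transitions are expensive to collect on real hardware, so each sample is reused in every regression sweep instead of being discarded after a single online update.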
Cite this article
Riedmiller, M., Gabel, T., Hafner, R. et al. Reinforcement learning for robot soccer. Auton Robot 27, 55–73 (2009). https://doi.org/10.1007/s10514-009-9120-4