10 Steps and Some Tricks to Set up Neural Reinforcement Controllers

Chapter

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7700)

Abstract

This chapter discusses the steps necessary to set up a neural reinforcement controller for successfully solving typical (real-world) control tasks. The main intention is to provide a code of practice for the crucial steps of transforming control task requirements into the specification of a reinforcement learning task. We do not claim that the procedure we propose is the only one (establishing that would require extensive empirical work, which is beyond the scope of this chapter), but wherever possible we give insights into why we proceed one way or the other. Our procedure for setting up a neural reinforcement learning system has worked well for a large range of real, realistic, and benchmark-style control applications.
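
To make the central step of the abstract concrete, the following is a minimal, hypothetical sketch (not code from the chapter) of how control task requirements might be transformed into a reinforcement learning task specification, using pole balancing as an example of the benchmark-style applications the chapter targets. The class name, signal ranges, action set, and cost constants are all illustrative assumptions, not the chapter's actual design.

```python
# Illustrative sketch only: one common way to cast a control task as an
# RL task, i.e. state signals, a discrete action set, goal and failure
# regions, and an immediate cost. All constants are assumed.
import numpy as np


class PoleBalancingTask:
    """Hypothetical RL specification of a pole-balancing control task."""

    # A small, discrete set of admissible control actions (here: applied
    # forces in newtons) is a typical choice for neural Q-learning setups.
    actions = [-10.0, 0.0, +10.0]

    def state(self, angle, angular_velocity):
        # The state vector contains exactly the quantities the controller
        # may observe: pole angle (rad) and angular velocity (rad/s).
        return np.array([angle, angular_velocity])

    def is_goal(self, s):
        # Goal region: pole approximately upright and moving slowly.
        return abs(s[0]) < 0.1 and abs(s[1]) < 0.5

    def is_failure(self, s):
        # Forbidden region: pole has fallen too far to recover.
        return abs(s[0]) > 1.0

    def cost(self, s):
        # Immediate cost: zero inside the goal region, a maximal penalty
        # on failure, and a small constant transition cost otherwise, so
        # minimizing accumulated cost drives the controller to reach the
        # goal quickly and stay there.
        if self.is_goal(s):
            return 0.0
        if self.is_failure(s):
            return 1.0
        return 0.01


# Hypothetical usage: a neural reinforcement learner would be trained
# against exactly this interface, choosing among task.actions so as to
# minimize the cost accumulated over time.
task = PoleBalancingTask()
s = task.state(angle=0.05, angular_velocity=0.2)
print(task.is_goal(s), task.cost(s))  # -> True 0.0
```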


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Riedmiller, M. (2012). 10 Steps and Some Tricks to Set up Neural Reinforcement Controllers. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade. Lecture Notes in Computer Science, vol. 7700. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35289-8_39

  • DOI: https://doi.org/10.1007/978-3-642-35289-8_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35288-1

  • Online ISBN: 978-3-642-35289-8
