
Reinforcement Learning in Fine Time Discretization

  • Conference paper

Adaptive and Natural Computing Algorithms (ICANNGA 2007)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 4431)

Abstract

Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous, and time is assumed to be discrete, though the discretization may be arbitrarily fine. It is shown that the stationary policies used by most RL methods are improper in control applications: as the time discretization becomes finer, they cannot ensure bounded variance of policy gradient estimators. As a remedy, we propose piecewise non-Markov policies. Policies of this type can be optimized with most RL algorithms, specifically those based on the likelihood ratio.
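The abstract's central claim — that a policy drawing a fresh random action at every time step yields likelihood-ratio gradient estimators whose variance blows up as the step shrinks, while holding each action for a fixed physical duration keeps the variance bounded — can be illustrated with a small Monte Carlo experiment. The sketch below is our own construction under simple assumptions (a 1-D integrator x' = u, a Gaussian policy, a terminal quadratic reward; the function name and all parameters are illustrative), not the paper's actual setup.

```python
import numpy as np

def lr_grad_samples(delta, hold, T=1.0, theta=0.0, sigma=0.5,
                    target=-1.0, n_runs=4000, seed=0):
    """REINFORCE (likelihood-ratio) gradient samples for a toy 1-D
    integrator x' = u with actions u ~ N(theta, sigma^2) and terminal
    reward -(x(T) - target)^2.  A fresh action is drawn every `hold`
    seconds: hold == delta gives the usual per-step (stationary Markov)
    policy, hold > delta a piecewise-constant, non-Markov one."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / delta))
    steps_per_hold = max(1, int(round(hold / delta)))
    grads = np.empty(n_runs)
    for r in range(n_runs):
        x, score, u = 0.0, 0.0, 0.0
        for k in range(n_steps):
            if k % steps_per_hold == 0:          # resample the action
                u = rng.normal(theta, sigma)
                score += (u - theta) / sigma**2  # d log-density / d theta
            x += u * delta                       # Euler step of x' = u
        grads[r] = score * -(x - target)**2      # likelihood-ratio estimator
    return grads

# Per-step policy: estimator variance grows roughly like 1/delta ...
v_fine   = lr_grad_samples(delta=0.01, hold=0.01).var()
v_coarse = lr_grad_samples(delta=0.10, hold=0.10).var()
# ... while a piecewise policy at the same fine delta stays bounded.
v_piece  = lr_grad_samples(delta=0.01, hold=0.10).var()
print(f"fine {v_fine:.0f}  coarse {v_coarse:.0f}  piecewise {v_piece:.0f}")
```

In this toy setting the empirical variance of the per-step estimator at delta = 0.01 is several times larger than at delta = 0.1, whereas holding actions for 0.1 s of physical time at delta = 0.01 matches the coarse case — mirroring the paper's argument for piecewise non-Markov policies.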




Editor information

Bartlomiej Beliczynski, Andrzej Dzielinski, Marcin Iwanowski, Bernardete Ribeiro


Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Wawrzyński, P. (2007). Reinforcement Learning in Fine Time Discretization. In: Beliczynski, B., Dzielinski, A., Iwanowski, M., Ribeiro, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2007. Lecture Notes in Computer Science, vol 4431. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71618-1_52


  • DOI: https://doi.org/10.1007/978-3-540-71618-1_52

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71589-4

  • Online ISBN: 978-3-540-71618-1

  • eBook Packages: Computer Science, Computer Science (R0)
