Recursive Adaptation of Stepsize Parameter for Non-stationary Environments

Noda, Itsuki

doi:10.1007/978-3-642-11161-7_38

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5925)

Included in the following conference series:

International Conference on Principles and Practice of Multi-Agent Systems

Abstract

In this article, we propose a method for adapting the stepsize parameters used in reinforcement learning to non-stationary environments. When the environment is non-stationary, the learning agent must adapt learning parameters such as the stepsize to changes in the environment through continuous learning. We show several theorems on higher-order derivatives of the exponential moving average, a base schema of major reinforcement learning methods that uses stepsize parameters. We also derive a systematic mechanism for calculating these derivatives recursively. Based on this mechanism, we construct a precise and flexible method that adapts the stepsize parameter so as to maximize a given criterion. The proposed method is validated by several experimental results.
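
The recursive calculation mentioned above can be illustrated with a minimal sketch. It assumes the exponential moving average takes the common form x_{t+1} = (1 - alpha) x_t + alpha r_t and that the derivatives are taken with respect to the stepsize alpha; both are assumptions, since the abstract gives no formulas. Differentiating this update k times yields a recursion in which each derivative at step t+1 depends only on derivatives at step t. The function and variable names are hypothetical, and this is not presented as the paper's algorithm.

```python
def ema_with_derivatives(observations, alpha, order=2, x0=0.0):
    """Return [x, dx/dalpha, d^2x/dalpha^2, ...] after processing all observations.

    Assumed update (not taken from the paper):
        x_{t+1} = (1 - alpha) * x_t + alpha * r_t
    Differentiating k times with respect to alpha gives
        x^(1)_{t+1} = (1 - alpha) * x^(1)_t - x_t + r_t
        x^(k)_{t+1} = (1 - alpha) * x^(k)_t - k * x^(k-1)_t   for k >= 2
    """
    # d[k] is the k-th derivative of the current estimate; d[0] is the estimate.
    d = [x0] + [0.0] * order
    for r in observations:
        new = [0.0] * (order + 1)
        new[0] = (1 - alpha) * d[0] + alpha * r
        for k in range(1, order + 1):
            new[k] = (1 - alpha) * d[k] - k * d[k - 1]
        if order >= 1:
            # Only the first derivative receives the observation term.
            new[1] += r
        d = new
    return d


def ema(observations, alpha, x0=0.0):
    """Plain exponential moving average, used to check the derivatives numerically."""
    x = x0
    for r in observations:
        x = (1 - alpha) * x + alpha * r
    return x
```

A stepsize adaptation rule could then use these derivatives, for example in a gradient step on a squared prediction error, but the specific criterion and update used in the paper are not stated in the abstract.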

Author information

Authors and Affiliations

Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology, 1-1-1 Umezono, Tsukuba, Ibaraki, 305-8568, Japan
Itsuki Noda
School of Information Science, Japan Advanced Institute of Science and Technology, Japan
Itsuki Noda
Department of Systems Innovation, The University of Tokyo, Japan
Itsuki Noda

Editor information

Editors and Affiliations

School of Computer Science and Information Engineering, The Catholic University of Korea, Bucheon, South Korea
Jung-Jin Yang
Faculty of Information Science and Electrical Engineering, Department of Informatics, Kyushu University, 744 Motooka, Nishi-ku, 819-0395, Fukuoka, Japan
Makoto Yokoo
School of Techno-Business Administration, Dept. of Computer Science, Nagoya Institute of Technology, Gokiso, Showa-ku, 466-8555, Nagoya, Japan
Takayuki Ito
School of Electronic Engineering and Computer Science, Peking University, No. 5 Yiheyuan Road, 100871, Beijing, China
Zhi Jin
Robotics Institute, Carnegie Mellon University, 5000 Forbes Avenue, 15213, Pittsburgh, PA, USA
Paul Scerri

About this paper

Cite this paper

Noda, I. (2009). Recursive Adaptation of Stepsize Parameter for Non-stationary Environments. In: Yang, JJ., Yokoo, M., Ito, T., Jin, Z., Scerri, P. (eds) Principles of Practice in Multi-Agent Systems. PRIMA 2009. Lecture Notes in Computer Science, vol 5925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11161-7_38

DOI: https://doi.org/10.1007/978-3-642-11161-7_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11160-0
Online ISBN: 978-3-642-11161-7
eBook Packages: Computer Science, Computer Science (R0)
