A Meta-learning Method Based on Temporal Difference Error

Kobayashi, Kunikazu; Mizoue, Hiroyuki; Kuremoto, Takashi; Obayashi, Masanao

doi:10.1007/978-3-642-10677-4_60

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5863)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

In general, the meta-parameters of a reinforcement learning system, such as the learning rate and the discount rate, are determined empirically and held fixed during learning. Consequently, when the external environment changes, the system cannot adapt to the variation. Meanwhile, it has been suggested that the biological brain may conduct reinforcement learning and adapt to the external environment by controlling neuromodulators that correspond to these meta-parameters. Based on this suggestion, the present paper proposes a method for adjusting meta-parameters using the temporal difference (TD) error. Computer simulations on a maze search problem and an inverted pendulum control problem verify that the proposed method can adjust the meta-parameters appropriately as the external environment varies.
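The abstract does not state the actual update rule, so the following is only an illustrative sketch of the general idea: a learning rate driven by the TD error instead of being fixed. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy corridor environment, the function names `td_error`, `adapt_alpha`, and `run`, the running-average constants, and the mapping from mean absolute TD error to the learning rate.

```python
import random


def td_error(q, state, action, reward, next_state, gamma, done):
    """One-step Q-learning TD error: r + gamma * max_a' Q(s', a') - Q(s, a)."""
    target = reward if done else reward + gamma * max(q[next_state])
    return target - q[state][action]


def adapt_alpha(avg_abs_delta, alpha_min=0.05, alpha_max=0.9, scale=1.0):
    """Hypothetical rule (an assumption, not from the paper): a larger running
    mean of |TD error| gives a larger learning rate in [alpha_min, alpha_max)."""
    ratio = avg_abs_delta / (avg_abs_delta + scale)
    return alpha_min + (alpha_max - alpha_min) * ratio


def run(episodes=200, n_states=6, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor; reward 1 at the right-most state."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    avg_abs_delta = 1.0  # start high so that early learning is fast
    alphas = []
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])  # greedy, with random tie-breaking
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            delta = td_error(q, state, action, reward, next_state, gamma, done)
            # The learning rate is recomputed from the TD-error statistic.
            q[state][action] += adapt_alpha(avg_abs_delta) * delta
            avg_abs_delta = 0.99 * avg_abs_delta + 0.01 * abs(delta)
            if done:
                break
            state = next_state
        alphas.append(adapt_alpha(avg_abs_delta))
    return q, alphas


if __name__ == "__main__":
    q, alphas = run()
    print(f"alpha after first episode: {alphas[0]:.3f}")
    print(f"alpha after last episode:  {alphas[-1]:.3f}")
```

In this sketch the learning rate tends to fall as the value estimates settle and the TD errors shrink. If the environment were changed mid-run, the TD errors would grow again and the learning rate would rise with them, which is the kind of adaptation the abstract describes. The same pattern could in principle be applied to other meta-parameters, such as the discount rate, though how the paper does so is not stated here.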

Author information

Authors and Affiliations

Yamaguchi University, Tokiwadai 2-16-1, Ube, Yamaguchi, 755-8611, Japan
Kunikazu Kobayashi, Hiroyuki Mizoue, Takashi Kuremoto & Masanao Obayashi

Editor information

Editors and Affiliations

Department of Electronic Engineering, City University of Hong Kong, Hong Kong
Chi Sing Leung
School of Electrical Engineering and Computer Science, Kyungpook National University, 1370 Sankyuk-Dong, Puk-Gu, 702-701, Taegu, Korea
Minho Lee
School of Information Technology, King Mongkut’s University of Technology Thonburi, 126 Pracha-U-Thit Rd., Bangmod, Thungkru, 10140, Bangkok, Thailand
Jonathan H. Chan

About this paper

Cite this paper

Kobayashi, K., Mizoue, H., Kuremoto, T., Obayashi, M. (2009). A Meta-learning Method Based on Temporal Difference Error. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10677-4_60

DOI: https://doi.org/10.1007/978-3-642-10677-4_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10676-7
Online ISBN: 978-3-642-10677-4
eBook Packages: Computer Science, Computer Science (R0)
