An Online Kernel-Based Clustering Approach for Value Function Approximation

Tziortziotis, Nikolaos; Blekas, Konstantinos

doi:10.1007/978-3-642-30448-4_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7297)

Included in the following conference series:

Hellenic Conference on Artificial Intelligence

Abstract

Value function approximation is a critical task in solving Markov decision processes and accurately modeling reinforcement learning agents. A significant issue is how to construct efficient feature spaces from samples collected from the environment in order to obtain an optimal policy. This study addresses the challenge by proposing an online kernel-based clustering approach for building appropriate basis functions during the learning process. The method uses a kernel function capable of handling state-action pairs as they are sequentially generated by the agent. At each time step, the procedure either adds a new cluster or adjusts the winning cluster’s parameters. The value function is modeled as a linear combination of the constructed basis functions, and its weights are optimized in a temporal-difference framework to minimize the Bellman approximation error. The proposed method is evaluated in numerous well-known simulated environments.
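
The add-or-adjust clustering step and the linear value function described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in this preview. It assumes a Gaussian kernel, a fixed similarity threshold for creating clusters, a running-mean update of the winning center, and a plain TD(0) weight update standing in for the paper's temporal-difference optimization. The names `OnlineKernelClusters`, `threshold`, `width`, `update_clusters`, and `td_update` are hypothetical.

```python
import math


class OnlineKernelClusters:
    """Illustrative online clustering of state-action vectors (hypothetical)."""

    def __init__(self, threshold=0.5, width=1.0):
        self.threshold = threshold  # assumed: minimum similarity to join a cluster
        self.width = width          # assumed: Gaussian kernel width
        self.centers = []           # one center per cluster / basis function
        self.counts = []            # number of samples assigned to each cluster
        self.weights = []           # one linear weight per basis function

    def kernel(self, x, c):
        # Gaussian kernel between a state-action vector and a cluster center.
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        return math.exp(-sq_dist / (2.0 * self.width ** 2))

    def features(self, x):
        # Basis function activations for input x.
        return [self.kernel(x, c) for c in self.centers]

    def value(self, x):
        # Value estimate as a linear combination of the basis functions.
        return sum(w * f for w, f in zip(self.weights, self.features(x)))

    def update_clusters(self, x):
        """Add a new cluster or move the winning center; True if one was added."""
        phi = self.features(x)
        if not phi or max(phi) < self.threshold:
            self.centers.append(list(x))
            self.counts.append(1)
            self.weights.append(0.0)  # new basis function starts with zero weight
            return True
        k = phi.index(max(phi))       # winning cluster
        self.counts[k] += 1
        self.centers[k] = [
            c + (xi - c) / self.counts[k] for c, xi in zip(self.centers[k], x)
        ]
        return False

    def td_update(self, x, reward, x_next, gamma=0.95, alpha=0.1):
        """One TD(0) step on the linear weights; returns the TD error."""
        phi = self.features(x)
        delta = reward + gamma * self.value(x_next) - self.value(x)
        self.weights = [w + alpha * delta * f for w, f in zip(self.weights, phi)]
        return delta
```

In an agent loop under these assumptions, each observed state-action vector would first be passed to `update_clusters`, and each transition would then be passed to `td_update`, so the feature space and the weights grow together as samples arrive.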

Author information

Authors and Affiliations

Department of Computer Science, University of Ioannina, P.O. Box 1186, Ioannina, 45110, Greece
Nikolaos Tziortziotis & Konstantinos Blekas

Editor information

Editors and Affiliations

Department of Computer Science and Biomedical Informatics, University of Central Greece, 2-4 Papassiopoulou Street, 35100, Lamia, Greece
Ilias Maglogiannis
Department of Computer Science and Biomedical Informatics, University of Central Greece, 2-4 Papassiopoulou Street, 35100, Lamia, Greece
Vassilis Plagianakos
Department of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
Ioannis Vlahavas

Cite this paper

Tziortziotis, N., Blekas, K. (2012). An Online Kernel-Based Clustering Approach for Value Function Approximation. In: Maglogiannis, I., Plagianakos, V., Vlahavas, I. (eds) Artificial Intelligence: Theories and Applications. SETN 2012. Lecture Notes in Computer Science, vol 7297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30448-4_23

DOI: https://doi.org/10.1007/978-3-642-30448-4_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30447-7
Online ISBN: 978-3-642-30448-4
eBook Packages: Computer Science, Computer Science (R0)
