Novel Feature Selection and Kernel-Based Value Approximation Method for Reinforcement Learning

Jakab, Hunor Sandor; Csató, Lehel

doi:10.1007/978-3-642-40728-4_22

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8131)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

We present a novel sparsification and value function approximation method for on-line reinforcement learning in continuous state and action spaces. Our approach is based on the kernel least squares temporal difference learning algorithm. We derive a recursive version and enhance the algorithm with a new sparsification mechanism based on the topology maps represented by proximity graphs. The sparsification mechanism – speeding up computations – favors data-points minimizing the divergence of the target-function gradient, thereby also considering the shape of the target function. The performance of our sparsification and approximation method is tested on a standard benchmark RL problem.
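To make the recursive least-squares temporal difference idea concrete, here is a minimal sketch in Python. It is not the authors' algorithm. It assumes a Gaussian kernel, a fixed, hand-picked dictionary of basis points (the proximity-graph sparsification described in the abstract is not reproduced), and a small ridge term `reg` to initialise the inverse. The names `gaussian_features`, `RecursiveKernelLSTD`, `width` and `reg` are hypothetical and chosen for illustration only. The inverse of the LSTD matrix is maintained with a Sherman-Morrison rank-one update, so each transition costs O(m²) for a dictionary of size m.

```python
import math


def gaussian_features(x, dictionary, width=1.0):
    """Kernel feature vector [k(x, d_1), ..., k(x, d_m)] for a Gaussian kernel."""
    return [
        math.exp(-sum((xi - di) ** 2 for xi, di in zip(x, d)) / (2.0 * width ** 2))
        for d in dictionary
    ]


class RecursiveKernelLSTD:
    """Recursive LSTD over kernel features of a fixed dictionary (illustrative)."""

    def __init__(self, dictionary, gamma=0.95, width=1.0, reg=1e-2):
        self.dictionary = [list(d) for d in dictionary]
        self.gamma = gamma
        self.width = width
        m = len(self.dictionary)
        # P tracks the inverse of A = reg*I + sum_t k_t (k_t - gamma*k_{t+1})^T.
        self.P = [[(1.0 / reg if i == j else 0.0) for j in range(m)] for i in range(m)]
        # b accumulates sum_t r_t * k_t.
        self.b = [0.0] * m

    def features(self, x):
        return gaussian_features(x, self.dictionary, self.width)

    def update(self, x, reward, x_next, terminal=False):
        """Process one transition (x, reward, x_next)."""
        m = len(self.b)
        k = self.features(x)
        k_next = [0.0] * m if terminal else self.features(x_next)
        d = [ki - self.gamma * kni for ki, kni in zip(k, k_next)]
        Pk = [sum(self.P[i][j] * k[j] for j in range(m)) for i in range(m)]  # P k
        dP = [sum(d[i] * self.P[i][j] for i in range(m)) for j in range(m)]  # d^T P
        denom = 1.0 + sum(dP[j] * k[j] for j in range(m))
        # Sherman-Morrison: (A + k d^T)^-1 = P - (P k)(d^T P) / (1 + d^T P k).
        for i in range(m):
            for j in range(m):
                self.P[i][j] -= Pk[i] * dP[j] / denom
        self.b = [bi + reward * ki for bi, ki in zip(self.b, k)]

    def value(self, x):
        """Estimated value V(x) = k(x)^T P b."""
        m = len(self.b)
        alpha = [sum(self.P[i][j] * self.b[j] for j in range(m)) for i in range(m)]
        return sum(a * ki for a, ki in zip(alpha, self.features(x)))


if __name__ == "__main__":
    # Toy check: a single state that loops onto itself with reward 1.
    model = RecursiveKernelLSTD(dictionary=[[0.0]], gamma=0.5)
    for _ in range(200):
        model.update([0.0], 1.0, [0.0])
    print(model.value([0.0]))
```

In the toy check, the exact value of the self-looping state is 1 / (1 - 0.5) = 2, and the estimate approaches it as the influence of the ridge term fades. The sketch uses only the standard library so that it stays self-contained; a practical implementation would use a linear algebra library and grow or prune the dictionary on-line.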

Author information

Authors and Affiliations

Faculty of Mathematics and Computer Science, Babeş-Bolyai University, Romania
Hunor Sandor Jakab & Lehel Csató

Editor information

Editors and Affiliations

Faculty Automation, Technical University of Sofia, 8 St. Kl. Ohridski Blvd., 1000, Sofia, Bulgaria
Valeri Mladenov
Institute of Information and Communication Technologies, Bulgarian Academy of Sciences, Acad. G. Bonchev str. bl.25A, 1113, Sofia, Bulgaria
Petia Koprinkova-Hristova
Institute of Neural Information Processing, University of Ulm, 89075, Ulm, Germany
Günther Palm
Quartier UNIL-Dorigny, Bâtiment Internef, Université de Lausanne, 1015, Lausanne, Switzerland
Alessandro E. P. Villa
Department of Computer Science, University of Milano, Via Comelico, 39, 20135, Milano, Italy
Bruno Appollini
Knowledge Engineering, School of Computing and Mathematical Sciences, Auckland University of Technology, 120 Mayoral Drive, 3rd floor, 1010, Auckland, New Zealand
Nikola Kasabov

About this paper

Cite this paper

Jakab, H.S., Csató, L. (2013). Novel Feature Selection and Kernel-Based Value Approximation Method for Reinforcement Learning. In: Mladenov, V., Koprinkova-Hristova, P., Palm, G., Villa, A.E.P., Appollini, B., Kasabov, N. (eds) Artificial Neural Networks and Machine Learning – ICANN 2013. ICANN 2013. Lecture Notes in Computer Science, vol 8131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40728-4_22

DOI: https://doi.org/10.1007/978-3-642-40728-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40727-7
Online ISBN: 978-3-642-40728-4
eBook Packages: Computer Science, Computer Science (R0)