Skip to main content

Novel Feature Selection and Kernel-Based Value Approximation Method for Reinforcement Learning

  • Conference paper
Book cover Artificial Neural Networks and Machine Learning – ICANN 2013 (ICANN 2013)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 8131))

Included in the following conference series:

Abstract

We present a novel sparsification and value function approximation method for on-line reinforcement learning in continuous state and action spaces. Our approach is based on the kernel least squares temporal difference learning algorithm. We derive a recursive version and enhance the algorithm with a new sparsification mechanism based on the topology maps represented by proximity graphs. The sparsification mechanism – speeding up computations – favors data-points minimizing the divergence of the target-function gradient, thereby also considering the shape of the target function. The performance of our sparsification and approximation method is tested on a standard benchmark RL problem.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Boyan, J.A.: Technical update: Least-squares temporal difference learning. Machine Learning 49(2-3), 233–246 (2002)

    Article  MATH  Google Scholar 

  2. Bradtke, S.J., Barto, A.G., Kaelbling, P.: Linear least-squares algorithms for temporal difference learning. In: Machine Learning, pp. 22–33 (1996)

    Google Scholar 

  3. Csató, L., Opper, M.: Sparse On-Line Gaussian Processes. In: Neural Computation, vol. 14(3), pp. 641–668 (2002)

    Google Scholar 

  4. Engel, Y., Mannor, S., Meir, R.: The kernel recursive least squares algorithm. IEEE Transactions on Signal Processing 52, 2275–2285 (2003)

    Article  MathSciNet  Google Scholar 

  5. Haasdonk, B., Bahlmann, C.: Learning with distance substitution kernels. In: Rasmussen, C.E., Bülthoff, H.H., Schölkopf, B., Giese, M.A. (eds.) DAGM 2004. LNCS, vol. 3175, pp. 220–227. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  6. Lagoudakis, M.G., Parr, R.: Least-squares policy iteration. J. Mach. Learn. Res. 4, 1107–1149 (2003)

    MathSciNet  Google Scholar 

  7. Puterman, M.L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York (1994)

    Book  MATH  Google Scholar 

  8. Ruggeri, M.R., Saupe, D.: Isometry-invariant matching of point set surfaces. In: Eurographics Workshop on 3D Object Retrieval (2008)

    Google Scholar 

  9. Szepesvári, C.: Algorithms for Reinforcement Learning. Morgan & Claypool (2011)

    Google Scholar 

  10. Taylor, G., Parr, R.: Kernelized value function approximation for reinforcement learning. In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML 2009, pp. 1017–1024. ACM, New York (2009)

    Google Scholar 

  11. von Luxburg, U.: A tutorial on spectral clustering. Statistics and Computing 17(4) (2007)

    Google Scholar 

  12. Xu, X., Hu, D., Lu, X.: Kernel-based least squares policy iteration for reinforcement learning. IEEE Transactions on Neural Networks, 973–992 (2007)

    Google Scholar 

  13. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press (1998)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jakab, H.S., Csató, L. (2013). Novel Feature Selection and Kernel-Based Value Approximation Method for Reinforcement Learning. In: Mladenov, V., Koprinkova-Hristova, P., Palm, G., Villa, A.E.P., Appollini, B., Kasabov, N. (eds) Artificial Neural Networks and Machine Learning – ICANN 2013. ICANN 2013. Lecture Notes in Computer Science, vol 8131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40728-4_22

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-40728-4_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40727-7

  • Online ISBN: 978-3-642-40728-4

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics