Adaptive Kernel-Width Selection for Kernel-Based Least-Squares Policy Iteration Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6676)

Abstract

The Kernel-based Least-Squares Policy Iteration (KLSPI) algorithm provides a general reinforcement learning solution for large-scale Markov decision problems. In KLSPI, the Radial Basis Function (RBF) kernel is commonly used to approximate the optimal value function with high precision. The success of KLSPI, however, depends heavily on choosing a suitable kernel width for the RBF kernel. In previous research, the kernel width was usually set manually or computed in advance from the sample distribution, which requires prior knowledge or model information. This paper proposes an adaptive kernel-width selection method for the KLSPI algorithm. First, a sparsification procedure with neighborhood analysis based on the ℓ2-ball of radius ε is adopted, which yields a reduced kernel dictionary without presetting the kernel width. Second, a gradient descent method based on the Bellman Residual Error (BRE) is proposed to find a kernel width that minimizes the sum of the BRE. The experimental results show that the proposed method helps KLSPI approximate the true value function more accurately and, ultimately, obtain a better control policy.
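
The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, since the full text is not shown here, and everything in it is an assumption made for illustration: the Gaussian form exp(-||x - c||² / (2σ²)) of the RBF kernel, the greedy ε-ball rule in `sparsify_l2_ball`, the squared-residual loss in `bre_loss_and_grad`, the least-squares refit of the weights `alpha`, and all function names and hyperparameters.

```python
import numpy as np

def sq_dists(X, C):
    """Pairwise squared Euclidean distances between rows of X and rows of C."""
    diff = X[:, None, :] - C[None, :, :]
    return (diff ** 2).sum(axis=2)

def rbf_kernel(X, C, sigma):
    """Gaussian RBF kernel matrix (assumed form): exp(-||x - c||^2 / (2 sigma^2))."""
    return np.exp(-sq_dists(X, C) / (2.0 * sigma ** 2))

def sparsify_l2_ball(X, eps):
    """Greedy dictionary (assumed rule): keep a sample only if it lies outside
    the l2-ball of radius eps around every element already kept.
    Note that no kernel width is needed for this step."""
    dictionary = [X[0]]
    for x in X[1:]:
        dists = np.linalg.norm(np.asarray(dictionary) - x, axis=1)
        if dists.min() > eps:
            dictionary.append(x)
    return np.asarray(dictionary)

def bre_loss_and_grad(sigma, alpha, S, R, S_next, D, gamma):
    """Half the sum of squared Bellman residuals and its derivative w.r.t. sigma,
    with the weights alpha held fixed (an assumption of this sketch)."""
    d2, d2n = sq_dists(S, D), sq_dists(S_next, D)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    Kn = np.exp(-d2n / (2.0 * sigma ** 2))
    delta = R + gamma * (Kn @ alpha) - K @ alpha      # Bellman residual per sample
    dK = K * d2 / sigma ** 3                          # d/dsigma of each kernel entry
    dKn = Kn * d2n / sigma ** 3
    ddelta = gamma * (dKn @ alpha) - dK @ alpha       # d(delta)/dsigma
    return 0.5 * np.sum(delta ** 2), np.sum(delta * ddelta)

def tune_sigma(S, R, S_next, D, gamma, sigma0=1.0, lr=0.01, iters=50):
    """Alternate a least-squares fit of alpha with one gradient step on sigma."""
    sigma = sigma0
    for _ in range(iters):
        Phi = rbf_kernel(S, D, sigma) - gamma * rbf_kernel(S_next, D, sigma)
        alpha, *_ = np.linalg.lstsq(Phi, R, rcond=None)
        _, grad = bre_loss_and_grad(sigma, alpha, S, R, S_next, D, gamma)
        sigma = max(sigma - lr * grad, 1e-3)          # keep the width positive
    return sigma
```

Here `S`, `R`, and `S_next` stand for sampled states, rewards, and successor states under a fixed policy, and `D` is the dictionary returned by `sparsify_l2_ball`. In a full policy-iteration loop the kernel would be defined over state-action pairs, which this sketch omits for brevity.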

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wu, J., Xu, X., Zuo, L., Li, Z., Wang, J. (2011). Adaptive Kernel-Width Selection for Kernel-Based Least-Squares Policy Iteration Algorithm. In: Liu, D., Zhang, H., Polycarpou, M., Alippi, C., He, H. (eds) Advances in Neural Networks – ISNN 2011. ISNN 2011. Lecture Notes in Computer Science, vol 6676. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21090-7_70

  • DOI: https://doi.org/10.1007/978-3-642-21090-7_70

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21089-1

  • Online ISBN: 978-3-642-21090-7

  • eBook Packages: Computer Science, Computer Science (R0)