Abstract
The model-free Least-Squares Policy Iteration (LSPI) method has been applied successfully to control problems in reinforcement learning. LSPI is a promising algorithm that uses a linear approximation architecture to perform policy optimization in the spirit of Q-learning. However, it faces challenging issues in the selection of basis functions and training samples. Inspired by the orthogonal least-squares regression method for selecting the centers of an RBF neural network, a new hybrid learning method for LSPI is proposed in this paper. The suggested method uses simulation as a tool to guide the "feature configuration" process. Results on the learning control of the Cart-Pole system illustrate the effectiveness of the presented method.
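LSPI alternates a least-squares policy-evaluation step (LSTD-Q) over a fixed linear architecture with greedy policy improvement; in the hybrid method described in the abstract, the RBF centers that define that architecture would be selected by an orthogonal least-squares style procedure. The following is a minimal sketch, not the authors' implementation, of the evaluation step with Gaussian RBF features; the function names, the sample format (s, a, r, s'), the width parameter, and the ridge term are illustrative assumptions rather than details taken from the paper.

```python
# Sketch of LSTD-Q policy evaluation with Gaussian RBF features
# (assumed setup; centers would come from an OLS-style selection step).
import numpy as np

def rbf_features(state, action, centers, width, n_actions):
    """Gaussian RBF activations over the state, replicated per discrete action."""
    acts = np.exp(-np.sum((centers - state) ** 2, axis=1) / (2.0 * width ** 2))
    phi = np.zeros(len(centers) * n_actions)
    phi[action * len(centers):(action + 1) * len(centers)] = acts
    return phi

def lstdq(samples, policy, centers, width, n_actions, gamma=0.95):
    """One LSTD-Q step: solve A w = b for the linear Q-function weights."""
    k = len(centers) * n_actions
    A = np.zeros((k, k))
    b = np.zeros(k)
    for s, a, r, s_next in samples:  # samples may be collected off-policy
        phi = rbf_features(s, a, centers, width, n_actions)
        phi_next = rbf_features(s_next, policy(s_next), centers, width, n_actions)
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A + 1e-6 * np.eye(k), b)  # small ridge term for stability
```

A full LSPI loop would repeatedly call lstdq with the greedy policy induced by the previous weight vector, stopping when the weights (and hence the policy) no longer change appreciably.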
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, H., Dagli, C.H. (2003). Hybrid Least-Squares Methods for Reinforcement Learning. In: Chung, P.W.H., Hinde, C., Ali, M. (eds) Developments in Applied Artificial Intelligence. IEA/AIE 2003. Lecture Notes in Computer Science(), vol 2718. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45034-3_47
DOI: https://doi.org/10.1007/3-540-45034-3_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40455-2
Online ISBN: 978-3-540-45034-4
eBook Packages: Springer Book Archive