The kNN-TD Reinforcement Learning Algorithm

Martín H., José Antonio; de Lope, Javier; Maravall, Darío

doi:10.1007/978-3-642-02264-7_32

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5601)

Abstract

A reinforcement learning algorithm called kNN-TD is introduced. The algorithm combines the classical formulation of temporal difference methods with a k-nearest neighbors scheme that serves as its expectations memory. This kind of memory allows the algorithm to generalize properly over continuous state spaces and to benefit from collective action selection and learning processes. Furthermore, adding probability traces yields the kNN-TD(λ) algorithm, which exhibits state-of-the-art performance. Finally, the proposed algorithm has been tested on a series of well-known reinforcement learning problems and at the Second Annual RL Competition, with excellent results.
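
The abstract gives no equations, so the following is only a rough sketch of how a k-nearest-neighbors memory might be combined with a temporal difference update; it is not the authors' implementation. The class name `KNNTD`, the inverse-distance weighting `1 / (1 + d²)`, the fixed set of memory points, the Q-learning style target, and all parameter values are assumptions made for illustration, and the probability traces of kNN-TD(λ) are left out.

```python
class KNNTD:
    """Hypothetical sketch of a kNN-based TD learner (not the authors' code)."""

    def __init__(self, centers, n_actions, k=2, alpha=0.5, gamma=0.9):
        self.centers = centers                          # fixed memory points in state space
        self.q = [[0.0] * n_actions for _ in centers]   # one row of action values per point
        self.k, self.alpha, self.gamma = k, alpha, gamma

    def neighbors(self, state):
        """Return indices and normalized weights of the k nearest memory points."""
        d2 = [sum((c - s) ** 2 for c, s in zip(center, state))
              for center in self.centers]
        idx = sorted(range(len(d2)), key=d2.__getitem__)[: self.k]
        w = [1.0 / (1.0 + d2[i]) for i in idx]          # assumed weighting scheme
        total = sum(w)
        return idx, [wi / total for wi in w]

    def values(self, state):
        """Action values at `state` as a weighted average over the neighbors."""
        idx, p = self.neighbors(state)
        n_actions = len(self.q[0])
        return [sum(pi * self.q[i][a] for i, pi in zip(idx, p))
                for a in range(n_actions)]

    def greedy(self, state):
        """Collective action selection: pick the action with the best averaged value."""
        v = self.values(state)
        return max(range(len(v)), key=v.__getitem__)

    def update(self, state, action, reward, next_state, done=False):
        """Spread one TD error over the neighbors in proportion to their weights."""
        idx, p = self.neighbors(state)
        q_sa = sum(pi * self.q[i][action] for i, pi in zip(idx, p))
        # Assumed Q-learning style target; the paper may use a different one.
        target = reward if done else reward + self.gamma * max(self.values(next_state))
        delta = target - q_sa
        for i, pi in zip(idx, p):
            self.q[i][action] += self.alpha * delta * pi
        return delta
```

In this sketch a single transition updates every neighbor of the visited state, which is one plausible reading of the "collective" learning process mentioned in the abstract: nearby states share experience without a discretized state table.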

Author information

Authors and Affiliations

Dep. Sistemas Informáticos y Computación, Universidad Complutense de Madrid, Spain
José Antonio Martín H.
Perception for Computers and Robots, Universidad Politécnica de Madrid, Spain
Javier de Lope & Darío Maravall

Editor information

Editors and Affiliations

Departamento de Inteligencia Artificial, Universidad Nacional de Educación a Distancia, E.T.S. de Ingeniería Informática, Juan del Rosal, 16, 28040, Madrid, Spain
José Mira & Félix de la Paz
Departamento de Electrónica, Tecnología de Computadores y Proyectos, Universidad Politécnica de Cartagena, Pl. Hospital, 1, 30201, Cartagena, Spain
José Manuel Ferrández
Departamento de Inteligencia Artificial, Universidad Nacional de Educación a Distancia, E.T.S. de Ingeniería Informática, Juan del Rosal, 16, 28040, Madrid, Spain
José R. Álvarez
Departamento de Electrónica, Tecnología de Computadoras y Proyectos, Universidad Politécnica de Cartagena, Pl. Hospital, 1, 30201, Cartagena
F. Javier Toledo

About this paper

Cite this paper

Martín H., J.A., de Lope, J., Maravall, D. (2009). The kNN-TD Reinforcement Learning Algorithm. In: Mira, J., Ferrández, J.M., Álvarez, J.R., de la Paz, F., Toledo, F.J. (eds) Methods and Models in Artificial and Natural Computation. A Homage to Professor Mira’s Scientific Legacy. IWINAC 2009. Lecture Notes in Computer Science, vol 5601. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02264-7_32

DOI: https://doi.org/10.1007/978-3-642-02264-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02263-0
Online ISBN: 978-3-642-02264-7
eBook Packages: Computer Science, Computer Science (R0)