Abstract
In this chapter we propose an inference metaheuristic for Kernel-Based Reinforcement Learning (KBRL) agents, that is, agents that operate in a continuous-state Markov decision process (MDP). The metaheuristic is developed for the simplified case of RL agents that follow a greedy policy with no receding horizon and learn online in an environment where feedback is generated by an ergodic and stationary source. We propose two inference strategies: isotropic discrete choice, which favors speed, and anisotropic optimization, which favors generalization capability. We cast classification as an RL problem and test the proposed metaheuristic in two experiments: an image recognition experiment on the Yale Faces database and an experiment on a synthetic data set. We also propose a set of inference filters that increase the vigilance of the agent, and we show that they can prevent the agent from taking erroneous actions in an unknown environment. Two parallel inference algorithms are tested and illustrated in cluster and GPU implementations.
Copyright information
© 2013 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
Bucur, L., Florea, A., Chera, C. (2013). A KBRL Inference Metaheuristic with Applications. In: Yang, X.-S. (ed.) Artificial Intelligence, Evolutionary Computing and Metaheuristics. Studies in Computational Intelligence, vol 427. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29694-9_27
DOI: https://doi.org/10.1007/978-3-642-29694-9_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29693-2
Online ISBN: 978-3-642-29694-9
eBook Packages: Engineering, Engineering (R0)