Abstract
Q-learning can be used to learn a control policy that maximises a scalar reward through interaction with the environment. Q-learning is commonly applied to problems with discrete states and actions. We describe a method suitable for control tasks which require continuous actions in response to continuous states. The system consists of a neural network coupled with a novel interpolator. Simulation results are presented for a non-holonomic control task. Advantage Learning, a variation of Q-learning, is shown to enhance learning speed and reliability for this task.
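To make the two update rules named in the abstract concrete, the sketch below shows a standard one-step Q-learning backup next to the Advantage Learning variant. It is deliberately tabular, with discretised states and actions, whereas the paper's actual system uses a neural network with an interpolator so that states and actions remain continuous; all parameter values (alpha, gamma, k, table sizes) are illustrative assumptions, not taken from the paper.

    import numpy as np

    n_states, n_actions = 10, 5
    alpha, gamma = 0.1, 0.99   # learning rate, discount factor
    k = 0.3                    # advantage scale; k = 1 recovers Q-learning

    Q = np.zeros((n_states, n_actions))   # action values (or advantages)

    def q_learning_update(s, a, r, s_next):
        """Standard one-step Q-learning backup."""
        target = r + gamma * Q[s_next].max()
        Q[s, a] += alpha * (target - Q[s, a])

    def advantage_learning_update(s, a, r, s_next):
        """Advantage Learning backup: the one-step error is scaled by
        1/k, widening the gap between the best action and the rest,
        which can help when action values are nearly equal."""
        v_s = Q[s].max()                        # value of best action in s
        q_target = r + gamma * Q[s_next].max()  # ordinary Q-learning target
        target = v_s + (q_target - v_s) / k
        Q[s, a] += alpha * (target - Q[s, a])

With k = 1 the Advantage Learning target reduces exactly to the Q-learning target, which is one way to see it as a variation of Q-learning rather than a separate algorithm.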
Cite this paper
Gaskett, C., Wettergreen, D., Zelinsky, A. (1999). Q-Learning in Continuous State and Action Spaces. In: Foo, N. (ed.) Advanced Topics in Artificial Intelligence. AI 1999. Lecture Notes in Computer Science, vol 1747. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46695-9_35
DOI: https://doi.org/10.1007/3-540-46695-9_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66822-0
Online ISBN: 978-3-540-46695-6