Abstract
When a dynamic system is controlled by a learning controller, the state space must be coarsely partitioned to make the learning task computationally feasible. This partition forms the learning algorithm's representation of the dynamic system. However, such representations normally make the system non-Markovian and thus hard to control, do not naturally allow the state to approach the setpoint asymptotically, and often necessitate large control actions. By analysing the partitioning function as an information channel that provides partial observation of the underlying Markovian process, it can be shown that the problems of hidden state, selective perception and partial observability are not caused by having the wrong number of cells, and so cannot be removed by pruning or splitting nodes, nor by augmenting observations with a history of past observations; they stem instead from a poor choice of base representation. Using an information loss metric together with sliding mode design heuristics, more appropriate partitions of the state space can be found which remove or reduce the learning and control problems associated with partitioned representations of dynamic systems. The practical benefits of these partitions are demonstrated by controlling a benchmark process.
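The information-channel view of a partition can be made concrete with a small sketch. The code below is not the paper's metric or its benchmark process; everything in it is an illustrative assumption: a damped oscillator as the plant, a position-only grid partition versus a sliding-surface-style partition on `s = x + v`, and the empirical conditional entropy H(cell at t+1 | cell at t) as a rough proxy for how much predictive information a given partition discards (a lower value suggests the partitioned process is closer to Markovian).

```python
import numpy as np
from collections import Counter

def cond_entropy(pairs):
    """H(next cell | current cell) in bits, estimated from (current, next) cell pairs."""
    joint = Counter(pairs)
    marginal = Counter(c for c, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (c, _c_next), k in joint.items():
        h -= (k / n) * np.log2(k / marginal[c])
    return h

def simulate(n_steps=20000, dt=0.05, seed=0):
    """Euler-integrate a damped oscillator x'' = -x - 0.5 x', restarting near the origin."""
    rng = np.random.default_rng(seed)
    x, v = rng.uniform(-1, 1), rng.uniform(-1, 1)
    traj = []
    for _ in range(n_steps):
        x, v = x + dt * v, v + dt * (-x - 0.5 * v)
        if x * x + v * v < 1e-4:                      # settled: restart from a random state
            x, v = rng.uniform(-1, 1), rng.uniform(-1, 1)
        traj.append((x, v))
    return traj

def grid_cell(x, v):
    """Partition on position only -- velocity becomes hidden state."""
    return int(np.clip((x + 1.5) // 0.5, 0, 5))

def sliding_cell(x, v):
    """Partition on a sliding-mode-style surface variable s = x + v."""
    return int(np.clip((x + v + 1.5) // 0.5, 0, 5))

traj = simulate()
grid_pairs = [(grid_cell(*a), grid_cell(*b)) for a, b in zip(traj, traj[1:])]
slide_pairs = [(sliding_cell(*a), sliding_cell(*b)) for a, b in zip(traj, traj[1:])]
h_grid = cond_entropy(grid_pairs)
h_slide = cond_entropy(slide_pairs)
print(f"H(next|current): position grid = {h_grid:.3f} bits, "
      f"surface partition = {h_slide:.3f} bits")
```

Under this toy setup, a partition whose cell boundaries align with the closed-loop dynamics tends to leave less of the next cell unexplained by the current cell; the same comparison could in principle be run over candidate partitions to pick a base representation, which is the spirit of the approach described in the abstract.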
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
McGarity, M. (1998). Finding partitions for learning control of dynamic systems. In: Mercer, R.E., Neufeld, E. (eds) Advances in Artificial Intelligence. Canadian AI 1998. Lecture Notes in Computer Science, vol 1418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64575-6_60
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64575-7
Online ISBN: 978-3-540-69349-9