Abstract
We present the first results obtained from two implementations of a hybrid architecture that balances exploration and exploitation to solve mazes with continuous search spaces. In both implementations the critic is built around a Radial Basis Function (RBF) Neural Network, which uses Temporal Difference learning to acquire a continuous-valued internal model of the environment through interaction with it. Both also employ an Evolutionary Algorithm in the search policy for each movement: a Genetic Algorithm (GA) in the first implementation and an Evolutionary Strategy (ES) in the second. Over successive trials the maze-solving agent learns the V-function, a mapping from real-valued positions in the maze to the value of being at those positions.
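The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a maze mapped onto the unit square, Gaussian basis functions on a regular grid, a TD(0) weight update, and a small (1+lambda) evolution strategy that searches bounded displacements for the next move. All names (RBFCritic, es_select_move) and all parameter values are hypothetical, and walls are ignored.

```python
import math
import random


class RBFCritic:
    """Gaussian RBF network estimating V(x, y) on the unit square (assumed setup)."""

    def __init__(self, grid=5, width=0.25, lr=0.1, gamma=0.95):
        # Centres on a regular grid; the layout and width are illustrative choices.
        self.centres = [(i / (grid - 1), j / (grid - 1))
                        for i in range(grid) for j in range(grid)]
        self.width = width
        self.lr = lr          # learning rate (assumed value)
        self.gamma = gamma    # discount factor (assumed value)
        self.weights = [0.0] * len(self.centres)

    def features(self, pos):
        """Activation of every Gaussian unit at position pos."""
        return [math.exp(-((pos[0] - cx) ** 2 + (pos[1] - cy) ** 2)
                         / (2.0 * self.width ** 2))
                for cx, cy in self.centres]

    def value(self, pos):
        """Current estimate of V(pos) as a weighted sum of activations."""
        return sum(w * f for w, f in zip(self.weights, self.features(pos)))

    def td_update(self, pos, reward, next_pos, terminal=False):
        """Apply one TD(0) step to the output weights and return the TD error."""
        target = reward
        if not terminal:
            target += self.gamma * self.value(next_pos)
        phi = self.features(pos)
        delta = target - sum(w * f for w, f in zip(self.weights, phi))
        for k, f in enumerate(phi):
            self.weights[k] += self.lr * delta * f
        return delta


def es_select_move(critic, pos, rng, offspring=10, sigma=0.05,
                   max_step=0.1, generations=5):
    """Search displacements with a (1+lambda) ES, scoring each by the critic."""

    def clamp(v, lo, hi):
        return max(lo, min(hi, v))

    def apply(step):
        # Resulting position, kept inside the unit square.
        return (clamp(pos[0] + step[0], 0.0, 1.0),
                clamp(pos[1] + step[1], 0.0, 1.0))

    best_step = (0.0, 0.0)           # staying put is the initial parent
    best_val = critic.value(pos)
    for _ in range(generations):
        parent = best_step
        for _ in range(offspring):
            # Gaussian mutation of the parent, bounded by the maximum step size.
            child = (clamp(parent[0] + rng.gauss(0.0, sigma), -max_step, max_step),
                     clamp(parent[1] + rng.gauss(0.0, sigma), -max_step, max_step))
            val = critic.value(apply(child))
            if val > best_val:       # elitist: keep only strict improvements
                best_step, best_val = child, val
    return apply(best_step)
```

In this sketch the selection step is purely greedy with respect to the learned value estimate. Exploration comes only from the mutation noise, which is one simple way to realise the exploration and exploitation balance the abstract describes, and not necessarily the one used in the paper.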
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pipe, A.G., Fogarty, T.C., Winfield, A. (1994). Hybrid adaptive heuristic critic architectures for learning in mazes with continuous search spaces. In: Davidor, Y., Schwefel, HP., Männer, R. (eds) Parallel Problem Solving from Nature — PPSN III. PPSN 1994. Lecture Notes in Computer Science, vol 866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58484-6_291
DOI: https://doi.org/10.1007/3-540-58484-6_291
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58484-1
Online ISBN: 978-3-540-49001-2
eBook Packages: Springer Book Archive