Abstract
It is shown how co-evolving populations of individual rules can outperform the genetic algorithm evolving a population of complete rule sets in learning control systems. A rule-based control system is presented which uses only the genetic algorithm to learn individual control rules, with immediate reinforcement after each rule fires. Its application to an industrial control problem is described as an example of its operation. The system is then refined to deal with delayed reward, and its operation on the cart-pole balancing problem is described. The refined system, using only selection and mutation to learn individual rules, is compared with the genetic algorithm learning a complete set of rules. The refined system, using only selection to learn individual rules, is also compared with the bucket-brigade and other reinforcement algorithms on the same task.
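The idea of learning individual rules with immediate reinforcement, using only selection and mutation, can be illustrated with a minimal sketch. This is not the system described in the paper: the toy plant, the state and action sets, the reward function, and every name below (`STATES`, `ACTIONS`, `reward`, `evolve_rules`) are assumptions made purely for illustration. Each state keeps its own small population of candidate rules, each rule is scored as soon as it fires, and the final controller takes the best rule from each population.

```python
import random

ACTIONS = (-1, 0, 1)         # hypothetical discrete control actions
STATES = (-2, -1, 0, 1, 2)   # hypothetical discretised plant states


def reward(state, action):
    """Toy immediate reinforcement: highest when the action drives the state towards 0."""
    return -abs(state + action)


def evolve_rules(generations=50, pop_size=6, mutation_rate=0.3, seed=0):
    """Evolve one population of individual rules per state (assumed setup)."""
    rng = random.Random(seed)
    # One population of candidate actions (individual rules) per state.
    pops = {s: [rng.choice(ACTIONS) for _ in range(pop_size)] for s in STATES}
    for _ in range(generations):
        for state, pop in pops.items():
            # Fire every rule once and rank it by its immediate reward.
            ranked = sorted(pop, key=lambda a: reward(state, a), reverse=True)
            survivors = ranked[: pop_size // 2]   # truncation selection
            # Mutation only, no crossover: each child copies a survivor
            # or, with probability mutation_rate, takes a random action.
            children = [
                rng.choice(ACTIONS) if rng.random() < mutation_rate else a
                for a in survivors
            ]
            pops[state] = survivors + children
    # The co-operating rule set is the best rule from each population.
    return {s: max(pop, key=lambda a: reward(s, a)) for s, pop in pops.items()}


if __name__ == "__main__":
    print(evolve_rules())
```

Because each rule is evaluated on its own, no credit has to be apportioned across a complete rule set in this toy setting. Delayed reward, as in cart-pole balancing, would need a credit-assignment step that this sketch leaves out.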
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fogarty, T.C. (1994). Co-evolving Co-operative populations of rules in learning control systems. In: Fogarty, T.C. (eds) Evolutionary Computing. AISB EC 1994. Lecture Notes in Computer Science, vol 865. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58483-8_15
DOI: https://doi.org/10.1007/3-540-58483-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58483-4
Online ISBN: 978-3-540-48999-3
eBook Packages: Springer Book Archive