
Q-learning in Evolutionary Rule Based Systems

  • Conference paper

Parallel Problem Solving from Nature — PPSN III (PPSN 1994)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 866)

Abstract

PANIC (Parallelism And Neural networks In Classifier systems), an Evolutionary Rule Based System (ERBS) that evolves behavioral strategies encoded as sets of rules, is presented. PANIC assigns credit to rules through a new mechanism, Q-Credit Assignment (QCA), based on Q-learning. Because QCA takes into account the context in which a rule is applied, it is more accurate than classical methods when a single rule can fire in different situations. QCA is implemented with a multi-layer feed-forward neural network.
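To make the idea concrete, the following is a minimal illustrative sketch (not the paper's implementation) of Q-learning-style credit assignment over (context, rule) pairs. The paper's QCA approximates the Q-function with a multi-layer feed-forward network; a lookup table is used here only for brevity, and the learning rate, discount factor, and all names (`qca_update`, the context and rule labels) are assumptions for illustration.

```python
# Illustrative sketch: Q-learning credit assignment for rules, where the
# credit of a rule depends on the context in which it fires. Tabular Q is
# used in place of the paper's feed-forward network for simplicity.
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor (assumed values)

Q = defaultdict(float)   # Q[(context, rule)] -> estimated credit


def qca_update(context, rule, reward, next_context, applicable_rules):
    """One Q-learning step crediting the fired rule in its context."""
    # Best credit attainable among rules applicable in the next context.
    best_next = max((Q[(next_context, r)] for r in applicable_rules),
                    default=0.0)
    # Standard temporal-difference error: r + gamma * max Q(s', .) - Q(s, a).
    td_error = reward + GAMMA * best_next - Q[(context, rule)]
    Q[(context, rule)] += ALPHA * td_error
    return Q[(context, rule)]


# The same rule earns different credit in different contexts:
qca_update("ctx_A", "rule_1", reward=1.0,
           next_context="ctx_B", applicable_rules=["rule_1"])
qca_update("ctx_C", "rule_1", reward=-1.0,
           next_context="ctx_B", applicable_rules=["rule_1"])
```

Under this scheme a classical strength-based method would assign "rule_1" a single credit value, whereas the context-indexed Q-function keeps its positive credit in one situation separate from its negative credit in the other.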




Editor information

Yuval Davidor, Hans-Paul Schwefel, Reinhard Männer


Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Giani, A., Baiardi, F., Starita, A. (1994). Q-learning in Evolutionary Rule Based Systems. In: Davidor, Y., Schwefel, HP., Männer, R. (eds) Parallel Problem Solving from Nature — PPSN III. PPSN 1994. Lecture Notes in Computer Science, vol 866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58484-6_271

  • DOI: https://doi.org/10.1007/3-540-58484-6_271

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58484-1

  • Online ISBN: 978-3-540-49001-2

  • eBook Packages: Springer Book Archive
