Anticipatory Learning Classifier Systems and Factored Reinforcement Learning

Sigaud, Olivier; Butz, Martin V.; Kozlova, Olga; Meyer, Christophe

doi:10.1007/978-3-642-02565-5_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5499)

Included in the following conference series:

Workshop on Anticipatory Behavior in Adaptive Learning Systems

Abstract

Factored Reinforcement Learning (FRL) is a new technique for solving Factored Markov Decision Problems (FMDPs) when the structure of the problem is not known in advance. Like Anticipatory Learning Classifier Systems (ALCSs), it is a model-based Reinforcement Learning approach that includes generalization mechanisms for structured domains. In general, FRL and ALCSs are explicit, state-anticipatory approaches that learn generalized state transition models and use them to improve system behavior through model-based reinforcement learning techniques. In this contribution, we highlight the conceptual similarities and differences between FRL and ALCSs, focusing on SPITI, an instance of an FRL method, on the one hand, and on the ALCSs MACS and XACS on the other. Though FRL systems seem to benefit from a clearer theoretical grounding, an empirical comparison between SPITI and XACS on two benchmark problems reveals that the latter scales much better than the former when some combinations of state variables do not occur. Based on this finding, we discuss the mechanisms in XACS that result in better scalability and propose importing these mechanisms into FRL systems.
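
To make the idea of a generalized state transition model concrete, the following is a minimal illustrative sketch and not the method of SPITI, MACS, or XACS. It assumes that the dependency structure (the hypothetical `parents` argument) is given, whereas the systems discussed in the abstract learn that structure from experience. The class name, the toy `toggle0` action, and the two-variable state are all invented for illustration.

```python
from collections import defaultdict


class FactoredTransitionModel:
    """Count-based model with one conditional table per state variable.

    Assumption for this sketch: parents[i] (the variables that variable i
    depends on) is supplied by hand rather than learned.
    """

    def __init__(self, parents):
        self.parents = parents
        # counts[i][(action, parent_values)][next_value] -> observation count
        self.counts = [defaultdict(lambda: defaultdict(int)) for _ in parents]

    def _key(self, i, state, action):
        # Only the parent variables of variable i enter the lookup key,
        # so all other variables are generalized over.
        return (action, tuple(state[j] for j in self.parents[i]))

    def update(self, state, action, next_state):
        for i, value in enumerate(next_state):
            self.counts[i][self._key(i, state, action)][value] += 1

    def predict(self, state, action):
        """Most frequent next value per variable, or None if never observed."""
        prediction = []
        for i in range(len(self.parents)):
            table = self.counts[i].get(self._key(i, state, action))
            prediction.append(max(table, key=table.get) if table else None)
        return tuple(prediction)


# Toy domain: two binary variables; "toggle0" flips variable 0 only.
model = FactoredTransitionModel(parents=[[0], [1]])
model.update((0, 0), "toggle0", (1, 0))
model.update((1, 1), "toggle0", (0, 1))

# The full state (0, 1) was never observed, yet each variable can be predicted.
unseen_prediction = model.predict((0, 1), "toggle0")
```

Because each variable is predicted from its parents alone, the model makes a prediction for a state it has never visited. A flat table indexed by the full state would have no entry for that state.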

Author information

Authors and Affiliations

Institut des Systèmes Intelligents et de Robotique (ISIR), CNRS UMR 7222, Université Pierre et Marie Curie - Paris6, 4 place Jussieu, F-75005, Paris, France
Olivier Sigaud & Olga Kozlova
Thales Security Solutions & Services, Simulation, 1 rue du Général de Gaulle, Osny BP 226, F95523, Cergy Pontoise Cedex, France
Olga Kozlova
Thales Security Solutions & Services, ThereSIS Research and Innovation Office, Route départementale 128, F91767, Palaiseau Cedex, France
Christophe Meyer
University of Würzburg, Röntgenring 11, 97070, Würzburg, Germany
Martin V. Butz

Editor information

Editors and Affiliations

Consiglio Nazionale delle Ricerche, Istituto di Linguistica Computazionale “Antonio Zampolli”, Via Giuseppe Moruzzi, 1 - 56124 Pisa, Italy and Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie della Cognizione, Via San Martino della Ba, Italy
Giovanni Pezzulo
COBOSLAB – Cognitive Bodyspaces: Learning and Behavior, Department of Psychology III, University of Würzburg, Röntgenring 11, 97070, Würzburg, Germany
Martin V. Butz
Institut des Systèmes Intelligents et de Robotique (CNRS UMR 7222), Université Pierre et Marie Curie, Pyramide Tour 55, 4 Place Jussieu, 75252, Paris Cedex 05, France
Olivier Sigaud
Consiglio Nazionale delle Ricerche, Istituto di Scienze e Tecnologie della Cognizione, Laboratory of Computational Embodied Neuroscience, Via San Martino della Battaglia 44, 00185, Roma, Italy
Gianluca Baldassarre

About this paper

Cite this paper

Sigaud, O., Butz, M.V., Kozlova, O., Meyer, C. (2009). Anticipatory Learning Classifier Systems and Factored Reinforcement Learning. In: Pezzulo, G., Butz, M.V., Sigaud, O., Baldassarre, G. (eds) Anticipatory Behavior in Adaptive Learning Systems. ABiALS 2008. Lecture Notes in Computer Science, vol 5499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02565-5_18

DOI: https://doi.org/10.1007/978-3-642-02565-5_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02564-8
Online ISBN: 978-3-642-02565-5
eBook Packages: Computer Science, Computer Science (R0)
