Generalized State Values in an Anticipatory Learning Classifier System

Butz, Martin V.; Goldberg, David E.

doi:10.1007/978-3-540-45002-3_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2684)

Abstract

This paper introduces generalized state values to the anticipatory learning classifier system ACS2. Previous studies showed that the generalized state values that evolve in ACS2 might be overgeneral for a proper policy representation. The policy representation is therefore separated from the model representation: a function approximation module is added that approximates state values. Action choice then depends on the learned generalized state values of the states predicted by the predictive model, which yields anticipatory behavior. It is shown that the function approximation module accurately generalizes the state value function in the investigated MDP. We suggest improving the approach through further anticipatory interaction between the predictive model learner and the state value learner. We also propose implementing task-dependent anticipatory attentional mechanisms that exploit the representation of the generalized state value function. Finally, the anticipatory framework may be extended to support multiple motivations, integrated in a motivational module that could be influenced by emotional biases.
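The action choice described in the abstract can be illustrated with a minimal sketch. It is not the ACS2 implementation: the rule format (condition strings where `#` matches any symbol), the averaging of matching rule values, the dictionary-based model, and all names below are assumptions made purely for illustration. The sketch predicts the successor state of each action with a model, looks up a generalized value for that predicted state, and picks the action whose anticipated state has the highest value.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical generalized value rule: (condition, value).
# '#' in a condition matches any symbol at that position (an assumption).
ValueRule = Tuple[str, float]
# Hypothetical predictive model: (state, action) -> anticipated next state.
Model = Dict[Tuple[str, str], str]


def matches(condition: str, state: str) -> bool:
    """Return True if every non-'#' symbol of the condition equals the state symbol."""
    return len(condition) == len(state) and all(
        c == "#" or c == s for c, s in zip(condition, state)
    )


def state_value(rules: List[ValueRule], state: str) -> float:
    """Average the values of all matching rules; 0.0 if none match (an assumption)."""
    matched = [value for condition, value in rules if matches(condition, state)]
    return sum(matched) / len(matched) if matched else 0.0


def choose_action(
    state: str, model: Model, rules: List[ValueRule], actions: List[str]
) -> Optional[str]:
    """Pick the action whose anticipated successor state has the highest value."""
    best_action, best_value = None, float("-inf")
    for action in actions:
        predicted = model.get((state, action))
        if predicted is None:
            continue  # the model makes no prediction for this action
        value = state_value(rules, predicted)
        if value > best_value:
            best_action, best_value = action, value
    return best_action


# Toy data, invented for this example only.
model: Model = {("00", "L"): "01", ("00", "R"): "10"}
rules: List[ValueRule] = [("1#", 900.0), ("0#", 100.0)]
chosen = choose_action("00", model, rules, ["L", "R"])
```

In this toy setup two rules cover all four two-bit states, which is the sense in which the value representation is generalized. Keeping the value rules separate from the model mirrors the separation of policy and model representation that the abstract describes, although the actual module in the paper may differ substantially.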

Author information

Authors and Affiliations

Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, IL, USA
Martin V. Butz & David E. Goldberg
Department of Cognitive Psychology, University of Würzburg, Germany
Martin V. Butz

Editor information

Editors and Affiliations

Department of Psychology, University of Würzburg, Röntgenring 11, 97070, Würzburg, Germany
Martin V. Butz
Animat Lab, University Paris VI, 104 Av du Président Kennedy, 75016, Paris, France
Olivier Sigaud
ADAge, LIPN, Univ. de Paris-Nord, 93 430, Villetaneuse, France
Pierre Gérard

About this chapter

Cite this chapter

Butz, M.V., Goldberg, D.E. (2003). Generalized State Values in an Anticipatory Learning Classifier System. In: Butz, M.V., Sigaud, O., Gérard, P. (eds) Anticipatory Behavior in Adaptive Learning Systems. Lecture Notes in Computer Science, vol 2684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45002-3_16

DOI: https://doi.org/10.1007/978-3-540-45002-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40429-3
Online ISBN: 978-3-540-45002-3
eBook Packages: Springer Book Archive
