Abstract
This paper introduces generalized state values to the anticipatory learning classifier system ACS2. Previous studies showed that the generalized state values that evolve in ACS2 may be overgeneral for a proper policy representation. The policy representation is therefore separated from the model representation, and a function approximation module is added that approximates state values. Action choice then depends on the learned generalized state values, which are predicted by means of the predictive model, yielding anticipatory behavior. It is shown that the function approximation module accurately generalizes the state-value function in the investigated MDP. We suggest improving the approach by means of further anticipatory interaction between the predictive model learner and the state-value learner. We also propose implementing task-dependent anticipatory attentional mechanisms that exploit the representation of the generalized state-value function. Finally, the anticipatory framework may be extended to support multiple motivations integrated in a motivational module, which could be influenced by emotional biases.
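The action-choice scheme described in the abstract can be illustrated with a minimal sketch. It assumes a predictive model that maps a state and an action to an anticipated next state, and a separate value approximator that scores states. All names here (choose_action, predict_next, state_value) and the toy corridor are hypothetical and are not taken from ACS2 itself. The purely greedy choice is also an assumption; the actual system may combine it with exploration.

```python
from typing import Callable, Hashable, Iterable, Optional

State = Hashable
Action = Hashable


def choose_action(
    state: State,
    actions: Iterable[Action],
    predict_next: Callable[[State, Action], Optional[State]],
    state_value: Callable[[State], float],
    default_value: float = 0.0,
) -> Optional[Action]:
    """Return the action whose anticipated successor state has the highest value.

    predict_next stands in for the predictive model (hypothetical interface);
    state_value stands in for the state-value approximator (hypothetical).
    """
    best_action: Optional[Action] = None
    best_value = float("-inf")
    for action in actions:
        next_state = predict_next(state, action)
        # Fall back to a default when the model makes no prediction.
        value = default_value if next_state is None else state_value(next_state)
        if value > best_value:
            best_action, best_value = action, value
    return best_action


# Toy corridor (assumed for illustration): states 0..3, goal at state 3.
def predict_next(state: int, action: str) -> Optional[int]:
    step = {"left": -1, "right": 1}[action]
    nxt = state + step
    return nxt if 0 <= nxt <= 3 else None  # no prediction beyond the walls


def state_value(state: int) -> float:
    return 0.9 ** (3 - state)  # discounted distance to the goal


if __name__ == "__main__":
    print(choose_action(1, ["left", "right"], predict_next, state_value))
```

Keeping the model and the value approximator as two separate callables mirrors the separation of model representation and policy representation that the abstract describes: either one can be replaced without touching the other.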
© 2003 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Butz, M.V., Goldberg, D.E. (2003). Generalized State Values in an Anticipatory Learning Classifier System. In: Butz, M.V., Sigaud, O., Gérard, P. (eds) Anticipatory Behavior in Adaptive Learning Systems. Lecture Notes in Computer Science, vol 2684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45002-3_16
DOI: https://doi.org/10.1007/978-3-540-45002-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40429-3
Online ISBN: 978-3-540-45002-3
eBook Packages: Springer Book Archive