Generalized State Values in an Anticipatory Learning Classifier System

Chapter in: Anticipatory Behavior in Adaptive Learning Systems

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2684)

Abstract

This paper introduces generalized state values to the anticipatory learning classifier system ACS2. Previous studies showed that the generalized state values evolving in ACS2 might be overgeneral for a proper policy representation. The policy representation is therefore separated from the model representation, and a function approximation module is added that approximates state values. Action choice then depends on the learned generalized values of the states anticipated by the predictive model, which yields anticipatory behavior. It is shown that the function approximation module accurately generalizes the state value function in the investigated MDP. We suggest improving the approach through further anticipatory interaction between the predictive model learner and the state value learner. We also propose implementing task-dependent anticipatory attentional mechanisms that exploit the representation of the generalized state-value function. Finally, the anticipatory framework may be extended to support multiple motivations integrated in a motivational module, which could in turn be influenced by emotional biases.
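The action-choice idea in the abstract can be illustrated with a minimal Python sketch. It is not the ACS2 implementation: the dictionary-based predictive model, the hand-set linear value function standing in for the function approximation module, the corridor encoding, and all names (`linear_value`, `choose_action`, `encode`) are assumptions made only for this illustration. For each available action, the agent anticipates the next state with the model, evaluates that state with the generalized value function, and picks the action whose anticipated state has the highest value.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]   # a perception string, here a tuple of bits
Action = str

# Hypothetical predictive model: (state, action) -> anticipated next state.
Model = Dict[Tuple[State, Action], State]


def linear_value(weights: List[float], state: State) -> float:
    """Generalized state value: bias plus weighted sum of state attributes.

    A linear approximator is an assumption of this sketch; it only stands in
    for whatever function approximation module is actually used.
    """
    return weights[0] + sum(w * x for w, x in zip(weights[1:], state))


def choose_action(state: State,
                  actions: List[Action],
                  model: Model,
                  value_fn: Callable[[State], float]) -> Action:
    """Pick the action whose anticipated next state has the highest value."""
    best_action, best_value = actions[0], float("-inf")
    for action in actions:
        # If the model has no prediction, assume the state does not change.
        anticipated = model.get((state, action), state)
        v = value_fn(anticipated)
        if v > best_value:   # strict comparison: ties keep the earlier action
            best_action, best_value = action, v
    return best_action


def encode(pos: int, n: int = 3) -> State:
    """Unary encoding of a corridor position 0..n as n bits."""
    return tuple(1 if i < pos else 0 for i in range(n))


# Toy corridor with positions 0..3; walls keep the agent in place.
ACTIONS = ["left", "right"]
model: Model = {}
for pos in range(4):
    model[(encode(pos), "left")] = encode(max(pos - 1, 0))
    model[(encode(pos), "right")] = encode(min(pos + 1, 3))

# Hand-set weights: the value grows with the number of set bits, so states
# closer to the right end of the corridor are valued higher.
weights = [0.0, 1.0, 1.0, 1.0]
chosen = choose_action(encode(1), ACTIONS, model,
                       lambda s: linear_value(weights, s))
print(chosen)
```

Because the value function is defined over state attributes rather than individual states, the same weights rank every anticipated state, including states never visited; this is the sense in which the policy is kept separate from the predictive model in the sketch.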

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Butz, M.V., Goldberg, D.E. (2003). Generalized State Values in an Anticipatory Learning Classifier System. In: Butz, M.V., Sigaud, O., Gérard, P. (eds) Anticipatory Behavior in Adaptive Learning Systems. Lecture Notes in Computer Science, vol 2684. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45002-3_16

  • DOI: https://doi.org/10.1007/978-3-540-45002-3_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40429-3

  • Online ISBN: 978-3-540-45002-3

  • eBook Packages: Springer Book Archive
