Abstract
The chapter investigates how model learning and behavioral learning in an anticipatory learning classifier system can be improved by biasing exploration. First, the applied system, ACS2, is explained. Next, an overview of the possibilities for applying exploration biases in anticipatory learning classifier systems, and in ACS2 specifically, is provided. Two biases are implemented in ACS2: a recency bias, termed action delay bias, and an error bias, termed knowledge array bias. To validate the biases, the system is applied to a dynamic maze task and a hand-eye coordination task. The experiments show that biased exploration enables ACS2 to evolve and adapt its internal environmental model faster. Adaptive behavior also improves.
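To make the two kinds of bias concrete, the following is a minimal sketch of recency-biased and error-biased action selection. It is not ACS2's implementation: ACS2 operates on a population of classifiers, whereas this toy version uses plain lookup tables keyed by (state, action). The class name `BiasedExplorer`, the parameters `bias_prob` and `rate`, and the even split between the two biases are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


class BiasedExplorer:
    """Toy tabular stand-in for biased exploration (hypothetical, not ACS2)."""

    def __init__(self, actions, bias_prob=0.5, seed=None):
        self.actions = list(actions)
        self.bias_prob = bias_prob  # assumed probability of using a bias
        self.rng = random.Random(seed)
        self.time = 0
        # (state, action) -> step at which the pair was last executed
        self.last_used = defaultdict(lambda: -1)
        # (state, action) -> running estimate of model prediction error
        self.error = defaultdict(lambda: 1.0)

    def recency_choice(self, state):
        """Recency bias: prefer the action executed longest ago in this state."""
        return min(self.actions, key=lambda a: self.last_used[(state, a)])

    def error_choice(self, state):
        """Error bias: prefer the action whose outcome is predicted worst."""
        return max(self.actions, key=lambda a: self.error[(state, a)])

    def choose(self, state):
        """Pick a biased action with probability bias_prob, else a random one."""
        if self.rng.random() < self.bias_prob:
            if self.rng.random() < 0.5:
                return self.recency_choice(state)
            return self.error_choice(state)
        return self.rng.choice(self.actions)

    def record(self, state, action, prediction_correct, rate=0.2):
        """Update the time stamp and error estimate after executing an action."""
        self.time += 1
        key = (state, action)
        self.last_used[key] = self.time
        target = 0.0 if prediction_correct else 1.0
        self.error[key] += rate * (target - self.error[key])


explorer = BiasedExplorer(["N", "E", "S", "W"], seed=0)
for action in ["N", "E", "S"]:
    explorer.record("s0", action, prediction_correct=True)
untried = explorer.recency_choice("s0")       # "W" has never been executed
explorer.record("s0", "W", prediction_correct=False)
worst_modeled = explorer.error_choice("s0")   # "W" keeps the highest error
```

In this sketch both biases steer exploration toward parts of the environment the model knows least about: the recency bias toward actions not tried for a long time, the error bias toward actions whose effects are still predicted poorly.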
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Butz, M.V. (2002). Biasing Exploration in an Anticipatory Learning Classifier System. In: Lanzi, P.L., Stolzmann, W., Wilson, S.W. (eds) Advances in Learning Classifier Systems. IWLCS 2001. Lecture Notes in Computer Science, vol 2321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48104-4_1
DOI: https://doi.org/10.1007/3-540-48104-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43793-2
Online ISBN: 978-3-540-48104-1
eBook Packages: Springer Book Archive