Abstract
Factored Markov Decision Processes are the theoretical framework underlying multi-step Learning Classifier Systems research. This framework is mostly used in the context of Two-stage Bayes Networks, a subset of Bayes Networks. In this paper, we compare the Learning Classifier Systems approach and the Bayes Networks approach to factored Markov Decision Problems. More specifically, we compare MACS, an Anticipatory Learning Classifier System, with Structured Policy Iteration, a general planning algorithm used in the context of Two-stage Bayes Networks. From that comparison, we define a new algorithm that adapts Structured Policy Iteration to the context of MACS. We conclude by calling for closer communication between the two research communities.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sigaud, O., Gourdin, T., Wuillemin, PH. (2004). Improving MACS Thanks to a Comparison with 2TBNs. In: Deb, K. (eds) Genetic and Evolutionary Computation – GECCO 2004. GECCO 2004. Lecture Notes in Computer Science, vol 3103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24855-2_95
DOI: https://doi.org/10.1007/978-3-540-24855-2_95
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22343-6
Online ISBN: 978-3-540-24855-2
eBook Packages: Springer Book Archive