Abstract
Learning classifier systems (LCSs) are rule-based machine learning techniques designed to learn optimal decision-making policies in the form of a compact set of maximally general and accurate rules. A study of the literature reveals that most existing LCSs focus primarily on learning deterministic policies. However, a desirable policy may often be stochastic, in particular when the environment is partially observable. To fill this gap, this paper proposes a new Michigan-style LCS called Natural XCS (NXCS), based on XCS, one of the most successful accuracy-based LCSs. NXCS enables direct learning of stochastic policies by utilizing a natural gradient learning technique under a policy gradient framework. Its effectiveness is experimentally compared with that of XCS and one of its variants, known as XCSμ. Our results show that NXCS can achieve competitive performance in both deterministic and stochastic multi-step problems.
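To illustrate why a stochastic policy can be preferable under partial observability, and what a natural gradient step looks like in the simplest case, the following is a minimal sketch on a hypothetical two-state aliased problem. It is not the NXCS algorithm: the toy environment, the discount factor, the step size, and all function names are assumptions made for illustration only. The policy is a single Bernoulli choice parameterized by a logit, and the Fisher information is taken with respect to that one observation, which is a simplification of the state-weighted Fisher matrix used in general policy gradient settings.

```python
import math

GAMMA = 0.9  # discount factor (assumed for this toy problem)


def expected_return(p_left):
    """Average discounted return when 'left' is chosen with probability p_left.

    Two hidden states A and B look identical to the agent and are equally
    likely at the start. In A only 'left' reaches the goal (reward 1); in B
    only 'right' does. A wrong action leaves the hidden state unchanged.
    """
    v_a = p_left / (1.0 - (1.0 - p_left) * GAMMA)
    v_b = (1.0 - p_left) / (1.0 - p_left * GAMMA)
    return 0.5 * (v_a + v_b)


def return_gradient(p_left):
    """Derivative of expected_return with respect to p_left."""
    d_a = (1.0 - GAMMA) / (1.0 - (1.0 - p_left) * GAMMA) ** 2
    d_b = -(1.0 - GAMMA) / (1.0 - p_left * GAMMA) ** 2
    return 0.5 * (d_a + d_b)


def sigmoid(theta):
    return 1.0 / (1.0 + math.exp(-theta))


def natural_gradient_ascent(theta=2.0, step_size=0.5, iterations=200):
    """Ascend the expected return using Fisher-preconditioned gradient steps."""
    for _ in range(iterations):
        p = sigmoid(theta)
        # Vanilla gradient with respect to the logit, via the chain rule.
        vanilla = return_gradient(p) * p * (1.0 - p)
        # Fisher information of a Bernoulli policy with respect to its logit.
        fisher = p * (1.0 - p)
        # Natural gradient: precondition the vanilla gradient by 1 / Fisher.
        theta += step_size * vanilla / fisher
    return sigmoid(theta)


if __name__ == "__main__":
    p_final = natural_gradient_ascent()
    print("learned P(left):", p_final)
    print("return of learned policy:", expected_return(p_final))
    print("return of deterministic 'always left':", expected_return(1.0))
```

In this toy problem, any deterministic memoryless policy never reaches the goal from one of the two hidden states, whereas the return is maximized by choosing each action with equal probability. Dividing by the Fisher information cancels the p(1 - p) factor that would otherwise shrink the logit gradient when the policy is nearly deterministic.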
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, G., Zhang, M., Pang, S., Douch, C. (2014). Stochastic Decision Making in Learning Classifier Systems through a Natural Policy Gradient Method. In: Loo, C.K., Yap, K.S., Wong, K.W., Beng Jin, A.T., Huang, K. (eds) Neural Information Processing. ICONIP 2014. Lecture Notes in Computer Science, vol 8836. Springer, Cham. https://doi.org/10.1007/978-3-319-12643-2_37
DOI: https://doi.org/10.1007/978-3-319-12643-2_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12642-5
Online ISBN: 978-3-319-12643-2
eBook Packages: Computer Science (R0)