Stochastic Decision Making in Learning Classifier Systems through a Natural Policy Gradient Method

Chen, Gang; Zhang, Mengjie; Pang, Shaoning; Douch, Colin

doi:10.1007/978-3-319-12643-2_37

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8836)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

Learning classifier systems (LCSs) are rule-based machine learning techniques designed to learn optimal decision-making policies in the form of a compact set of maximally general and accurate rules. A study of the literature reveals that most existing LCSs focus primarily on learning deterministic policies. However, a desirable policy may often be stochastic, particularly when the environment is partially observable. To fill this gap, this paper proposes a new Michigan-style LCS called Natural XCS (NXCS), based on XCS, one of the most successful accuracy-based LCSs. NXCS learns stochastic policies directly by applying natural gradient learning within a policy gradient framework. Its effectiveness is compared experimentally with XCS and one of its variants, XCS_μ. Our results show that NXCS can achieve competitive performance in both deterministic and stochastic multi-step problems.
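As a rough illustration of the general idea of a natural policy gradient (not the NXCS algorithm itself, whose details are not given in this abstract), the sketch below updates a softmax policy over three actions in a single-state problem. The reward values, the step size, and the function names `softmax` and `natural_gradient` are assumptions chosen for the example. The ordinary policy gradient is premultiplied by the pseudo-inverse of the Fisher information matrix, following the natural gradient idea of Amari (1998).

```python
import numpy as np

def softmax(theta):
    """Stochastic policy: action probabilities from preferences theta."""
    z = np.exp(theta - theta.max())  # subtract max for numerical stability
    return z / z.sum()

def natural_gradient(theta, rewards):
    """Natural gradient of the expected reward for a softmax policy.

    Assumes a single-state problem with known expected rewards
    (an illustrative simplification, not the setting of the paper).
    """
    pi = softmax(theta)
    expected = pi @ rewards
    grad = pi * (rewards - expected)          # ordinary policy gradient
    fisher = np.diag(pi) - np.outer(pi, pi)   # Fisher information of softmax
    # The Fisher matrix is singular along the all-ones direction,
    # so a pseudo-inverse is used instead of a plain inverse.
    return np.linalg.pinv(fisher, rcond=1e-10) @ grad

# Hypothetical expected rewards for three actions.
rewards = np.array([1.0, 0.5, 0.0])
theta = np.zeros(3)
for _ in range(20):
    theta = theta + 0.5 * natural_gradient(theta, rewards)

policy = softmax(theta)  # still a probability distribution over actions
```

In this toy setting the natural gradient reduces to the rewards centred on their mean, so the preferences move at a rate that does not depend on how peaked the policy already is. The resulting policy favours the best action while still assigning nonzero probability to the others.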

Author information

Authors and Affiliations

Unitec Institute of Technology, Victoria University of Wellington, New Zealand
Gang Chen, Mengjie Zhang, Shaoning Pang & Colin Douch

Editor information

Editors and Affiliations

Department of Artificial Intelligence, Faculty of Computer Science and Information Technology Building, University of Malaya, 50603, Kuala Lumpur, Malaysia
Chu Kiong Loo
Department of Electronics and Communication Engineering, College of Engineering, Jalan IKRAM-UNITEN, Universiti Tenaga Nasional, 43009, Kajang, Selangor, Malaysia
Keem Siah Yap
School of Engineering and Information Technology, Murdoch University, 6150, South St, Murdoch, Western Australia, Australia
Kok Wai Wong
Department of Electrical and Electronics Engineering, Yonsei University, 50 Yonsei-ro, Seodaemun-gu, 120-749, Seoul, South Korea
Andrew Teoh Beng Jin
Department of Electrical and Electronic Engineering, Xi’an Jiaotong-Liverpool University, Ren’ai Road 111, SIP 215123, Suzhou, Jiangsu Province, China
Kaizhu Huang

About this paper

Cite this paper

Chen, G., Zhang, M., Pang, S., Douch, C. (2014). Stochastic Decision Making in Learning Classifier Systems through a Natural Policy Gradient Method. In: Loo, C.K., Yap, K.S., Wong, K.W., Beng Jin, A.T., Huang, K. (eds) Neural Information Processing. ICONIP 2014. Lecture Notes in Computer Science, vol 8836. Springer, Cham. https://doi.org/10.1007/978-3-319-12643-2_37

DOI: https://doi.org/10.1007/978-3-319-12643-2_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12642-5
Online ISBN: 978-3-319-12643-2
eBook Packages: Computer Science, Computer Science (R0)