DOI: 10.1145/3449726.3463159
research-article

An experimental comparison of explore/exploit strategies for the learning classifier system XCS

Published: 08 July 2021

ABSTRACT

When choosing which actions to execute, reinforcement learners constantly face the decision between exploiting existing knowledge and exploring new options, which risks short-term costs but may improve performance in the long run. This paper describes and experimentally evaluates four existing explore/exploit strategies for the learning classifier system XCS. The evaluation is carried out on three well-known learning problems: two multiplexers and one maze environment. An automated parameter optimization shows that different environments require different parametrizations of the strategies. Further, our results indicate that none of the strategies is superior to the others. Multi-step problems with scarce rewards prove challenging for the selected strategies, highlighting the need to develop more reliable explore/exploit strategies for such environments.


Published in

GECCO '21: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2021
2047 pages
ISBN: 9781450383516
DOI: 10.1145/3449726

Copyright © 2021 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
