ABSTRACT
XCS (the accuracy-based learning classifier system) acquires accurate classifiers when rewards are consistent, but in real-world problems it does not always receive a consistent reward even when it produces the same output for the same input. Such inconsistency prevents XCS from using the subsumption mechanism to reduce the number of overspecific accurate classifiers, which makes it hard for XCS to acquire the optimal classifiers. To address this issue, our previous research proposed XCS-MR (XCS based on Mean of Reward), which can reduce the number of classifiers even in environments where the magnitude of the reward is uncertain. However, XCS-MR requires a large amount of learning data to determine the accuracy of classifiers correctly, because it needs to record the average and variance of the rewards over the entire input-output space. To overcome this problem, this paper proposes a new XCS that can reduce the number of classifiers even in uncertain-reward environments without recording the average and variance of the rewards over the entire input-output space. Experiments show the effectiveness of the proposed XCS.
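To make the data requirement concrete, the following is a minimal sketch of what recording the average and variance of rewards for every input-output pair could look like. It is not the XCS-MR implementation: the names `RewardStats`, `reward_table`, and `record`, the bit-string input, and the sample reward values are all assumptions chosen for illustration, and Welford's online algorithm is used only as one common way to keep running statistics.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RewardStats:
    """Running mean and variance of rewards (Welford's online algorithm)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, reward: float) -> None:
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 before any sample has been seen.
        return self.m2 / self.count if self.count else 0.0


# Hypothetical table: one statistics record per (input, output) pair.
reward_table = defaultdict(RewardStats)


def record(state: str, action: int, reward: float) -> None:
    """Fold one observed reward into the statistics of its (input, output) pair."""
    reward_table[(state, action)].update(reward)


# The same input and output receive noisy rewards (illustrative values).
for r in (900.0, 1100.0, 1000.0):
    record("0101", 1, r)

stats = reward_table[("0101", 1)]
print(stats.count, stats.mean, stats.variance)
```

Because each pair needs several samples before its statistics settle, the number of records, and hence the learning data required, grows with the size of the input-output space under this kind of bookkeeping.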
Index Terms
- Automatic adjustment of selection pressure based on range of reward in learning classifier system