ABSTRACT
The accuracy-based learning classifier system (XCS) prefers to generalize classifiers that always receive the same reward, because such classifiers make accurate reward predictions. However, real-world problems are noisy, so a classifier may not receive the same reward every time even if it always takes the correct action. In this case every classifier receives multiple reward values, and XCS cannot identify the accurate classifiers. In this paper, we study a single-step environment with action noise, in which XCS's action is sometimes changed at random. To overcome this problem, this paper proposes XCS based on Collective weighted Reward (XCS-CR) to identify the accurate classifiers. In XCS, each rule predicts its next reward by averaging its past rewards. XCS-CR instead predicts its next reward by selecting one reward from the set of past rewards, comparing those past rewards with the collective weighted average reward of the rules that match the current input for each action. This comparison helps XCS-CR identify rewards that result from action noise. In experiments, XCS-CR acquired the optimal generalized classifier subset in 6-Multiplexer problems with action noise, just as in the environment without noise, and correctly judged those optimal generalized classifiers to be accurate.
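The difficulty described above can be illustrated with a minimal sketch of a 6-Multiplexer environment with action noise. This is not the authors' implementation: the noise model (with probability `noise`, the chosen action is replaced by one drawn uniformly from both actions), the 1000/0 reward scheme, the learning rate `beta`, and all function names are assumptions made for illustration. The sketch shows that even a policy that always chooses the correct action receives two different reward values, so an averaged prediction settles on neither of them.

```python
import random


def multiplexer6(bits):
    """Correct output of the 6-multiplexer: 2 address bits select 1 of 4 data bits."""
    address = bits[0] * 2 + bits[1]
    return bits[2 + address]


def noisy_step(bits, action, noise, rng):
    """Reward for `action`, after the action is possibly replaced at random.

    Assumed noise model: with probability `noise` the action is redrawn
    uniformly from {0, 1}. Assumed reward scheme: 1000 if correct, else 0.
    """
    if rng.random() < noise:
        action = rng.randrange(2)
    return 1000.0 if action == multiplexer6(bits) else 0.0


def rewards_for_correct_policy(noise, steps, seed=0):
    """Rewards received by a policy that always intends the correct action."""
    rng = random.Random(seed)
    rewards = []
    for _ in range(steps):
        bits = [rng.randrange(2) for _ in range(6)]
        rewards.append(noisy_step(bits, multiplexer6(bits), noise, rng))
    return rewards


def averaged_prediction(rewards, beta=0.2, initial=0.0):
    """Running-average prediction, p <- p + beta * (r - p); beta is assumed."""
    p = initial
    for r in rewards:
        p += beta * (r - p)
    return p


if __name__ == "__main__":
    clean = rewards_for_correct_policy(noise=0.0, steps=5000)
    noisy = rewards_for_correct_policy(noise=0.2, steps=5000)
    print("distinct rewards without noise:", sorted(set(clean)))
    print("distinct rewards with noise:   ", sorted(set(noisy)))
    print("averaged prediction with noise:", averaged_prediction(noisy))
```

Under these assumptions, the noise-free run yields a single reward value, while the noisy run yields both 0 and 1000 for a policy that never intends a wrong action. An accuracy measure built on the averaged prediction therefore sees a persistent error even for a correct rule, which is the situation XCS-CR is proposed to handle.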
Index Terms
- XCS-CR: determining accuracy of classifier by its collective reward in action set toward environment with action noise