DOI: 10.1145/2001858.2002021

Tutorial

Random artificial incorporation of noise in a learning classifier system environment

Published: 12 July 2011

Abstract

Effective rule generalization in learning classifier systems (LCSs) has long been an important consideration. In noisy problem domains, where attributes do not precisely determine class, overemphasis on accuracy without sufficient generalization leads to over-fitting of the training data and a large discrepancy between training and testing accuracies. This issue is of particular concern in noisy bioinformatic problems such as complex-disease gene association studies. In an effort to promote effective generalization, we introduce and explore a simple strategy that seeks to discourage over-fitting via the probabilistic incorporation of random noise within training instances. We evaluate a variety of noise models and magnitudes that either specify an equal probability of noise per attribute or target higher noise probability to the attributes that tend to be generalized most frequently. Our results suggest that targeted noise incorporation can reduce training accuracy without eroding testing accuracy. In addition, we observe a slight improvement in our power estimates (i.e., the ability to detect the true underlying model(s)).
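
The abstract describes the noise-injection strategy only at a high level. As a rough illustration, the sketch below shows one way such probabilistic noise incorporation could be implemented for discrete, SNP-style attributes (genotypes coded 0/1/2), with both an equal-probability model and a targeted model that weights noise toward frequently generalized attributes. The function names, the per-attribute rate p_noise, and the use of observed '#' (don't-care) frequencies as targeting weights are illustrative assumptions, not the authors' exact method.

import random

# Illustrative sketch (not the authors' implementation): probabilistic noise
# incorporation in training instances with discrete SNP-style attributes.
GENOTYPES = (0, 1, 2)

def uniform_noise_probs(n_attributes, p_noise):
    # Equal-probability model: every attribute shares the same noise rate.
    return [p_noise] * n_attributes

def targeted_noise_probs(generalization_freqs, p_noise):
    # Targeted model: redistribute the same average noise rate toward the
    # attributes that the rule population generalizes ('#') most often.
    n = len(generalization_freqs)
    total = sum(generalization_freqs)
    if total == 0:
        return [p_noise] * n
    return [min(1.0, p_noise * n * f / total) for f in generalization_freqs]

def add_noise(instance, noise_probs, rng=random):
    # Return a copy of one training instance in which each attribute value is
    # replaced, with its own probability, by a different random genotype.
    noisy = list(instance)
    for i, p in enumerate(noise_probs):
        if rng.random() < p:
            noisy[i] = rng.choice([g for g in GENOTYPES if g != noisy[i]])
    return noisy

# Hypothetical usage: 10% average noise, targeted by '#' frequencies.
instance = [0, 1, 2, 1, 0]
gen_freqs = [0.9, 0.2, 0.8, 0.1, 0.7]
print(add_noise(instance, targeted_noise_probs(gen_freqs, p_noise=0.1)))

In an actual LCS run, such noise would presumably be re-sampled each time a training instance is presented, so the learner never sees a single fixed corrupted dataset.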


Cited By

• (2013) Efficient training set use for blood pressure prediction in a large scale learning classifier system. Proceedings of the 15th annual conference companion on Genetic and evolutionary computation, pp. 1267-1274. DOI: 10.1145/2464576.2482705. Online publication date: 6 July 2013.


        Published In

        GECCO '11: Proceedings of the 13th annual conference companion on Genetic and evolutionary computation
        July 2011
        1548 pages
ISBN: 9781450306904
DOI: 10.1145/2001858


        Publisher

        Association for Computing Machinery

        New York, NY, United States



        Author Tags

        1. gene association study
        2. generalization
        3. genetic algorithm
        4. genetics-based machine learning
        5. learning classifier system
        6. noise
        7. ucs

        Qualifiers

        • Tutorial

        Conference

        GECCO '11

        Acceptance Rates

        Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
