Improving the scalability of rule-based evolutionary learning

Abstract

Evolutionary learning techniques are comparable in accuracy to other learning methods such as Bayesian learning and support vector machines (SVMs). They often produce more interpretable knowledge than, for instance, SVMs; efficiency, however, is a significant drawback. This paper presents a new representation motivated by our observation that Bioinformatics and Systems Biology often give rise to very large-scale datasets that are noisy, ambiguous and usually described by a large number of attributes. The crucial observation is that, in the most successful rules obtained for such datasets, only a few key attributes (out of the many available) are expressed in a rule; hence, automatically discovering these few key attributes and keeping track of only them yields a substantial speed-up by avoiding useless match operations against irrelevant attributes. In effect, this procedure performs fine-grained feature selection at the level of individual rules, since the key attributes may differ from one learned rule to another. The proposed representation has been tested within the BioHEL machine learning system, and the experiments show that it not only achieves competent learning performance but also reduces the system's run-time considerably: it is up to 2–3 times faster than state-of-the-art evolutionary learning representations designed specifically for efficiency.
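
The mechanism described above, in which each rule explicitly lists only the few attributes it expresses and matching iterates over that short list alone, can be illustrated with a small sketch. The Python snippet below is a minimal illustration under our own assumptions, not BioHEL's actual implementation; the SparseRule class and its field names are hypothetical and serve only to convey the idea.

    # Minimal sketch (hypothetical names, not BioHEL's code): a rule that tracks
    # only its expressed attributes, so matching skips all irrelevant ones.

    class SparseRule:
        """A classification rule storing conditions only for its key attributes.

        Each condition is a tuple (attribute_index, low, high), i.e. an interval
        over one attribute; attributes not listed are irrelevant to this rule.
        """

        def __init__(self, conditions, predicted_class):
            self.conditions = conditions          # list of (attr_index, low, high)
            self.predicted_class = predicted_class

        def matches(self, instance):
            # Only the few expressed attributes are tested; no match operation
            # is wasted on the (possibly thousands of) irrelevant attributes.
            for attr, low, high in self.conditions:
                if not (low <= instance[attr] <= high):
                    return False
            return True

    # Usage: a rule over a 1000-attribute instance that expresses just 2 attributes.
    rule = SparseRule(conditions=[(3, 0.2, 0.8), (417, 1.5, 9.0)], predicted_class=1)
    instance = [0.0] * 1000
    instance[3], instance[417] = 0.5, 2.0
    print(rule.matches(instance))   # True, after testing only 2 attributes

A dense representation would test every attribute on every match; restricting matching to each rule's own short attribute list is the source of the savings described in the abstract.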


Author information

Corresponding author

Correspondence to Natalio Krasnogor.

About this article

Cite this article

Bacardit, J., Burke, E.K. & Krasnogor, N. Improving the scalability of rule-based evolutionary learning. Memetic Comp. 1, 55–67 (2009). https://doi.org/10.1007/s12293-008-0005-4
