Improving the scalability of rule-based evolutionary learning

Bacardit, Jaume; Burke, Edmund K.; Krasnogor, Natalio

doi:10.1007/s12293-008-0005-4

Regular Research Paper
Published: 12 December 2008

Memetic Computing, Volume 1, pages 55–67 (2009)

Jaume Bacardit¹,², Edmund K. Burke¹ & Natalio Krasnogor¹

Abstract

Evolutionary learning techniques are comparable in accuracy with other learning methods such as Bayesian learning and support vector machines (SVMs). They often produce more interpretable knowledge than, for example, SVMs; however, their efficiency is a significant drawback. This paper presents a new representation motivated by our observation that Bioinformatics and Systems Biology often give rise to very large-scale datasets that are noisy, ambiguous and usually described by a large number of attributes. The crucial observation is that the most successful rules obtained for such datasets express only a few key attributes out of the many available. Automatically discovering these few key attributes and keeping track of only them therefore yields a substantial speed-up, because it avoids useless match operations with irrelevant attributes. In effect, this procedure performs fine-grained feature selection at the rule level, as the key attributes may differ for each learned rule. The proposed representation has been tested within the BioHEL machine learning system, and the experiments show that it not only achieves competent learning performance but also considerably reduces the system's run-time: it is up to 2–3 times faster than state-of-the-art evolutionary learning representations designed specifically for efficiency.
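The idea of matching only the attributes a rule actually expresses can be illustrated with a small sketch. This is not the representation used in BioHEL; the class `SparseRule`, the helper `dense_matches`, the interval-style predicates and the toy data are all assumptions made for illustration. It only shows why the matching cost of a rule can depend on the number of expressed attributes rather than on the total number of attributes in the dataset.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]


@dataclass
class SparseRule:
    """Hypothetical rule that stores intervals only for the attributes it expresses."""
    intervals: Dict[int, Interval]  # attribute index -> (lower, upper)
    predicted_class: int

    def matches(self, instance: Sequence[float]) -> bool:
        # Only the expressed attributes are tested, so the cost grows with
        # len(self.intervals), not with the length of the instance.
        return all(lo <= instance[i] <= hi
                   for i, (lo, hi) in self.intervals.items())


def dense_matches(bounds: List[Interval], instance: Sequence[float]) -> bool:
    # Baseline for comparison: one interval per attribute, every one tested.
    return all(lo <= x <= hi for (lo, hi), x in zip(bounds, instance))


# Toy instance with 300 attributes, all normalised to [0, 1] (assumed).
n_attributes = 300
instance = [0.5] * n_attributes
instance[7] = 0.9

# The sparse rule expresses only two of the 300 attributes.
rule = SparseRule(intervals={7: (0.8, 1.0), 42: (0.0, 0.6)}, predicted_class=1)

# Equivalent dense rule: irrelevant attributes cover the whole [0, 1] range.
dense_bounds = [(0.0, 1.0)] * n_attributes
for index, interval in rule.intervals.items():
    dense_bounds[index] = interval

print(rule.matches(instance), dense_matches(dense_bounds, instance))
```

Both functions return the same answer for any instance in the assumed range, but the sparse rule performs two comparisons here where the dense one performs 300. Because each rule carries its own set of attribute indices, different rules can rely on different attributes, which is the rule-level feature selection the abstract describes.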

Author information

Authors and Affiliations

ASAP Research Group, School of Computer Science, University of Nottingham, Jubilee Campus, Nottingham, NG8 1BB, UK
Jaume Bacardit, Edmund K. Burke & Natalio Krasnogor
Multidisciplinary Centre for Integrative Biology, School of Biosciences, University of Nottingham, Sutton Bonington, LE12 5RD, UK
Jaume Bacardit

Authors

Jaume Bacardit
Edmund K. Burke
Natalio Krasnogor

Corresponding author

Correspondence to Natalio Krasnogor.

About this article

Cite this article

Bacardit, J., Burke, E.K. & Krasnogor, N. Improving the scalability of rule-based evolutionary learning. Memetic Comp. 1, 55–67 (2009). https://doi.org/10.1007/s12293-008-0005-4

Received: 06 July 2008
Accepted: 27 October 2008
Published: 12 December 2008
Issue Date: March 2009
DOI: https://doi.org/10.1007/s12293-008-0005-4
