Analysing BioHEL using challenging boolean functions

  • Special Issue
  • Published in: Evolutionary Intelligence

Abstract

In this work we present an extensive empirical analysis of the BioHEL genetics-based machine learning system using the k-Disjunctive Normal Form (k-DNF) family of Boolean functions. These functions pose a broad set of challenges for most machine learning techniques, such as different degrees of specificity, class imbalance, and niche overlap. Moreover, since the ideal solutions are known, it is possible to assess whether a learning system is able to find them, and how quickly. Specifically, we study two aspects of BioHEL: its sensitivity to the coverage breakpoint parameter (which determines the degree of generality pressure applied by the fitness function) and the impact of the default rule policy. The results show that BioHEL is highly sensitive to the choice of coverage breakpoint and that using a default class suited to the problem allows the system to learn faster than other default class policies (e.g., the majority class policy). Furthermore, the experiments indicate that BioHEL’s scalability depends directly on both k (the specificity of the k-DNF terms) and the number of terms in the problem. In the last part of the paper we discuss alternative policies for adjusting the coverage breakpoint parameter.
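
To make the problem family concrete, here is a minimal Python sketch of how a k-DNF target concept can be generated and evaluated. This is not BioHEL’s code or its actual problem generator; make_kdnf, eval_kdnf and the (attribute index, required value) literal encoding are illustrative assumptions. Each term fixes exactly k of the binary attributes, so a larger k makes every term more specific (it matches only a 1/2^k fraction of the input space, which also drives class imbalance), while a larger number of terms means more niches the learner must cover.

    import random

    def make_kdnf(n_vars, k, n_terms, seed=0):
        # A term is a list of (attribute_index, required_value) pairs over
        # exactly k distinct attributes; the formula is the OR of all terms.
        rng = random.Random(seed)
        return [[(i, rng.randint(0, 1)) for i in rng.sample(range(n_vars), k)]
                for _ in range(n_terms)]

    def eval_kdnf(terms, x):
        # x is a sequence of 0/1 attribute values; an example is positive
        # if at least one term has all of its k literals satisfied.
        return int(any(all(x[i] == v for i, v in term) for term in terms))

    # Example: a 3-DNF over 20 binary attributes with 5 terms.
    target = make_kdnf(n_vars=20, k=3, n_terms=5)
    sample = [random.randint(0, 1) for _ in range(20)]
    print(eval_kdnf(target, sample))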


Notes

  1. The number of examples needed to learn a classification problem.
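
As a rough, hedged orientation (these are standard PAC-learning bounds, not results stated on this page): the number of examples needed to learn a concept class is commonly lower-bounded in terms of its VC dimension d, and for k-DNF formulas over n binary attributes d grows on the order of n^k for fixed k:

    m = \Omega\!\left(\frac{d + \log(1/\delta)}{\varepsilon}\right),
    \qquad
    d_{\mathrm{VC}}\bigl(k\text{-DNF over } n \text{ attributes}\bigr) = \Theta\!\left(n^{k}\right) \text{ for fixed } k,

where \varepsilon and \delta are the usual PAC accuracy and confidence parameters. This is only a back-of-the-envelope view, but it is consistent with the paper's empirical observation that learning difficulty grows quickly with k.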


Acknowledgments

The authors would like to thank the UK Engineering and Physical Sciences Research Council (EPSRC) for its support under grant EP/H016597/1. They would also like to acknowledge the High Performance Computing facility at the University of Nottingham for providing the computing infrastructure used for these experiments.

Author information

Corresponding author

Correspondence to María A. Franco.


About this article

Cite this article

Franco, M.A., Krasnogor, N. & Bacardit, J. Analysing BioHEL using challenging boolean functions. Evol. Intel. 5, 87–102 (2012). https://doi.org/10.1007/s12065-012-0080-9


