KEEL: a software tool to assess evolutionary algorithms for data mining problems

  • Focus
  • Soft Computing

Abstract

This paper introduces KEEL, a software tool to assess evolutionary algorithms for Data Mining problems of various kinds, including regression, classification and unsupervised learning. It includes evolutionary learning algorithms based on different approaches (Pittsburgh, Michigan and iterative rule learning, IRL), as well as the integration of evolutionary learning techniques with different pre-processing techniques, allowing it to perform a complete analysis of any learning model in comparison with existing software tools. Moreover, KEEL has been designed with a double goal: research and education.

Author information

Corresponding author

Correspondence to J. Alcalá-Fdez.

Additional information

Supported by the Spanish Ministry of Science and Technology under Projects TIN-2005-08386-C05-(01, 02, 03, 04 and 05). The work of Dr. Bacardit is also supported by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant GR/T07534/01.

About this article

Cite this article

Alcalá-Fdez, J., Sánchez, L., García, S. et al. KEEL: a software tool to assess evolutionary algorithms for data mining problems. Soft Comput 13, 307–318 (2009). https://doi.org/10.1007/s00500-008-0323-y

  • DOI: https://doi.org/10.1007/s00500-008-0323-y
