Abstract
Experimental analysis of the performance of a proposed method is a crucial and necessary task in research. This paper focuses on the statistical analysis of results in the field of genetics-based machine learning. It presents a study of a set of techniques that enable a rigorous comparison among algorithms in terms of obtaining successful classification models. Two accuracy measures for multi-class problems have been employed: classification rate and Cohen's kappa. Furthermore, two interpretability measures have been employed: size of the rule set and number of antecedents. We have studied whether the samples of results obtained by genetics-based classifiers, using the performance measures cited above, satisfy the necessary conditions for being analysed by means of parametric tests. The results obtained show that the fulfilment of these conditions is problem-dependent and inconsistent, which supports the use of non-parametric statistics in the experimental analysis. In addition, non-parametric tests can be satisfactorily employed for comparing generic classifiers over various data-sets considering any performance measure. On these grounds, we propose the use of the most powerful non-parametric statistical tests to carry out multiple comparisons. However, the statistical analysis conducted on interpretability must be considered with care.
References
Aguilar-Ruiz JS, Giráldez R, Riquelme JC (2000) Natural encoding for evolutionary supervised learning. IEEE Trans Evol Comput 11(4):466–479
Alcalá-Fdez J, Sánchez L, García S, del Jesus MJ, Ventura S, Garrell JM, Otero J, Romero C, Bacardit J, Rivas VM, Fernández JC, Herrera F (2009) KEEL: a software tool to assess evolutionary algorithms to data mining problems. Soft Comput 13(3):307–318
Alpaydin E (2004) Introduction to machine learning. MIT Press, Cambridge
Anglano C, Botta M (2002) NOW G-Net: learning classification programs on networks of workstations. IEEE Trans Evol Comput 6(13):463–480
Asuncion A, Newman DJ (2007) UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine, CA. http://www.ics.uci.edu/~mlearn/MLRepository.htm
Bacardit J (2004) Pittsburgh genetic-based machine learning in the data mining era: representations, generalization and run-time. Dissertation, Department of Computer Science, Ramon Llull University, Barcelona, Spain
Bacardit J, Garrell JM (2003) Evolving multiple discretizations with adaptive intervals for a pittsburgh rule-based learning classifier system. In: Proceedings of the genetic and evolutionary computation conference (GECCO’03), vol 2724. LNCS, Germany, pp 1818–1831
Bacardit J, Garrell JM (2004) Analysis and improvements of the adaptive discretization intervals knowledge representation. In: Proceedings of the genetic and evolutionary computation conference (GECCO’04), vol 3103. LNCS, Germany, pp 726–738
Bacardit J, Garrell JM (2007) Bloat control and generalization pressure using the minimum description length principle for Pittsburgh approach learning classifier system. In: Kovacs T, Llorá X, Takadama K (eds) Advances at the frontier of learning classifier systems, vol 4399. LNCS, USA, pp 61–80
Barandela R, Sánchez JS, García V, Rangel E (2003) Strategies for learning in class imbalance problems. Pattern Recognit 36(3):849–851
Ben-David A (2007) A lot of randomness is hiding in accuracy. Eng Appl Artif Intell 20:875–885
Bernadó-Mansilla E, Garrell JM (2003) Accuracy-based learning classifier systems: models, analysis and applications to classification tasks. Evol Comput 11(3):209–238
Bernadó-Mansilla E, Ho TK (2005) Domain of competence of XCS classifier system in complexity measurement space. IEEE Trans Evol Comput 9(1):82–104
Clark P, Niblett T (1989) The CN2 induction algorithm. Machine Learn 3(4):261–283
Cohen J (1960) A coefficient of agreement for nominal scales. Educ Psychol Meas 20:37–46
Corcoran AL, Sen S (1994) Using real-valued genetic algorithms to evolve rule sets for classification. In: Proceedings of the IEEE conference on evolutionary computation, pp 120–124
De Jong KA, Spears WM, Gordon DF (1993) Using genetic algorithms for concept learning. Machine Learn 13:161–188
Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Machine Learn Res 7:1–30
Drummond C, Holte RC (2006) Cost curves: an improved method for visualizing classifier performance. Machine Learn 65(1):95–130
Freitas AA (2002) Data mining and knowledge discovery with evolutionary algorithms. Springer, Berlin
Grefenstette JJ (1993) Genetic algorithms for machine learning. Kluwer, Norwell
Guan SU, Zhu F (2005) An incremental approach to genetic-algorithms-based classification. IEEE Trans Syst Man Cybern B 35(2):227–239
Hekanaho J (1998) An evolutionary approach to concept learning. Dissertation, Department of Computer Science, Abo akademi University, Abo, Finland
Hochberg Y (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75:800–802
Holm S (1979) A simple sequentially rejective multiple test procedure. Scand J Stat 6:65–70
Huang J, Ling CX (2005) Using AUC and accuracy in evaluating learning algorithms. IEEE Trans Knowl Data Eng 17(3):299–310
Iman RL, Davenport JM (1980) Approximations of the critical region of the Friedman statistic. Commun Stat 18:571–595
Jiao L, Liu J, Zhong W (2006) An organizational coevolutionary algorithm for classification. IEEE Trans Evol Comput 10(1):67–80
Koch GG (1970) The use of non-parametric methods in the statistical analysis of a complex split plot experiment. Biometrics 26(1):105–128
Landgrebe TCW, Duin RPW (2008) Efficient multiclass ROC approximation by decomposition via confusion matrix perturbation analysis. IEEE Trans Pattern Anal Mach Intell 30(5):810–822
Lim T-S, Loh W-Y, Shih Y-S (2000) A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learn 40(3):203–228
Markatou M, Tian H, Biswas S, Hripcsak G (2005) Analysis of variance of cross-validation estimators of the generalization error. J Machine Learn Res 6:1127–1168
Rivest RL (1987) Learning decision lists. Machine Learn 2:229–246
Shaffer JP (1995) Multiple hypothesis testing. Annu Rev Psychol 46:561–584
Sheskin DJ (2006) Handbook of parametric and nonparametric statistical procedures. Chapman & Hall/CRC, London
Sigaud O, Wilson SW (2007) Learning classifier systems: a survey. Soft Comput 11:1065–1078
Sokolova M, Japkowicz N, Szpakowicz S (2006) Beyond accuracy, F-score and ROC: a family of discriminant measures for performance evaluation. In: Australian conference on artificial intelligence, vol 4304. LNCS, Germany, pp 1015–1021
Tan KC, Yu Q, Ang JH (2006) A coevolutionary algorithm for rules discovery in data mining. Int J Syst Sci 37(12):835–864
Tulai AF, Oppacher F (2004) Multiple species weighted voting: a genetics-based machine learning system. In: Proceedings of the genetic and evolutionary computation conference (GECCO'04), vol 3103. LNCS, Germany, pp 1263–1274
Venturini G (1993) SIA: a supervised inductive algorithm with genetic search for learning attributes based concepts. In: Proceedings of the machine learning ECML’93, vol 667. LNAI, Germany, pp 280–296
Wilson SW (1994) ZCS: a zeroth order classifier system. Evol Comput 2:1–18
Wilson SW (1995) Classifier fitness based on accuracy. Evol Comput 3(2):149–175
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Wright SP (1992) Adjusted p-values for simultaneous inference. Biometrics 48:1005–1013
Youden W (1950) Index for rating diagnostic tests. Cancer 3:32–35
Zar JH (1999) Biostatistical analysis. Prentice Hall, Englewood Cliffs
Acknowledgments
The study was supported by the Spanish Ministry of Science and Technology under Project TIN-2005-08386-C05-01. J. Luengo holds an FPU scholarship from the Spanish Ministry of Education and Science. The authors are very grateful to the anonymous reviewers for their valuable suggestions and comments, which helped improve the quality of this paper. We are also very grateful to Prof. Bacardit, Prof. Bernadó-Mansilla and Prof. Aguilar-Ruiz for providing the KEEL software with the GASSIST-ADI, XCS and HIDER algorithms, respectively.
Appendix A: Genetic algorithms in classification
Here we give a more detailed description of the methods employed in our work, covering the main components, structure and operation of each. For further details about these methods, please refer to the corresponding references.
Pitts-GIRLA algorithm. The Pittsburgh genetic interval rule learning algorithm (Pitts-GIRLA) (Corcoran and Sen 1994) is a GBML method that follows the Pittsburgh approach to perform classification tasks. Each attribute condition is encoded by two real values indicating the minimum and maximum of an interval; a "don't care" condition occurs when the maximum value is lower than the minimum value.
This algorithm employs three different operators: modified simple (one point) crossover, creep mutation and simple random mutation.
XCS algorithm. XCS (Wilson 1995) is an LCS that evolves online a set of rules that accurately describe the feature space. In the following we present the components of this algorithm in detail:
1. Interaction with the environment: In keeping with the typical LCS model, the environment provides as input to the system a series of sensory situations σ(t) ∈ {0,1}^L, where L is the number of bits in each situation. In response, the system executes actions α(t) ∈ {a_1, …, a_n} upon the environment. Each action results in a scalar reward ρ(t).
2. A classifier in XCS: XCS keeps a population of classifiers which represent its knowledge about the problem. Each classifier is a condition-action-prediction rule with the following parts: the condition C ∈ {0,1,#}^L, the action A ∈ {a_1, …, a_n} and the prediction p. Furthermore, each classifier keeps additional parameters such as the prediction error ε, the fitness F, the experience exp, the time stamp ts, the action set size as and the numerosity.
3. The different sets: Four different sets need to be considered in XCS: the population [P], the match set [M], the action set [A] and the previous action set [A_{-1}].
As a result, the knowledge acquired by the algorithm is represented by a set of rules (classifiers), each with an associated fitness. When classifying unseen examples, each rule that matches the input votes according to its prediction and fitness, and the class with the most votes is chosen as the output.
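The voting scheme just described can be sketched as follows (an illustrative reconstruction, not the full XCS algorithm; the field names and data layout are our own assumptions):

```python
# Sketch of classifying an unseen binary situation with a final XCS rule
# population: each matching classifier votes for its action with weight
# prediction * fitness, and the action with the highest total vote wins.

def ternary_match(condition, situation):
    """condition is a string over {'0', '1', '#'}; '#' matches either bit."""
    return all(c == '#' or c == s for c, s in zip(condition, situation))

def classify(population, situation):
    """Fitness-weighted vote among the classifiers matching `situation`."""
    votes = {}
    for cl in population:
        if ternary_match(cl["condition"], situation):
            votes[cl["action"]] = (votes.get(cl["action"], 0.0)
                                   + cl["prediction"] * cl["fitness"])
    return max(votes, key=votes.get) if votes else None
```

For example, with two classifiers {condition "1#0", action 0, prediction 900, fitness 0.8} and {condition "1##", action 1, prediction 200, fitness 0.9}, the situation "110" matches both, and action 0 wins with a vote of 720 against 180.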
GASSIST algorithm. Genetic Algorithms based claSSIfier sySTem (GASSIST) (Bacardit and Garrell 2007) is a Pittsburgh-style classifier system based on GABIL (De Jong et al. 1993), from which it takes the semantically correct crossover operator. The main features of this classifier system are as follows:
1. General operators and policies:
- Matching strategy: The matching process follows an "if ... then ... else if ... then ..." structure, usually called a decision list (Rivest 1987).
- Mutation operators: When an individual is selected for mutation, a random gene inside its chromosome is chosen to be mutated.
2. Control of the individuals' length: This control is achieved using two different operators:
- Rule deletion: This operator deletes the rules of an individual that do not match any training example.
- Selection bias using the individual size: Tournament selection is used, where the tournament criterion is given by an operator called "hierarchical selection", defined as follows:
  - If |accuracy_a − accuracy_b| < threshold then:
    - If length_a < length_b then a is better than b.
    - If length_a > length_b then b is better than a.
    - If length_a = length_b then we use the general case.
  - Otherwise, we use the general case: we select the individual with the higher fitness.
3.
Knowledge representations
- Rule representation for symbolic or discrete attributes: GASSIST uses the GABIL (De Jong et al. 1993) representation for this kind of attribute.
- Rule representation for real-valued attributes: In GASSIST-ADI, the representation is based on the adaptive discretization intervals rule representation (Bacardit and Garrell 2003; Bacardit 2004).
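The "hierarchical selection" tournament criterion described above can be sketched as follows (an illustrative reconstruction; the dict layout and threshold value are our own assumptions, not from the paper):

```python
# Sketch of GASSIST's hierarchical selection: when two individuals perform
# almost equally (accuracy difference below a threshold), the shorter one
# wins; otherwise the general case applies and the fitter one wins.

def hierarchical_better(a, b, threshold=0.01):
    """Return the preferred of two individuals.

    Each individual is a dict with 'accuracy', 'length' and 'fitness' keys.
    """
    if abs(a["accuracy"] - b["accuracy"]) < threshold:
        if a["length"] < b["length"]:
            return a
        if a["length"] > b["length"]:
            return b
        # equal lengths fall through to the general case
    # general case: the individual with higher fitness wins
    return a if a["fitness"] >= b["fitness"] else b
```

This bias rewards compact rule sets without sacrificing individuals whose accuracy is genuinely better.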
HIDER algorithm. HIerarchical DEcision Rules (HIDER) (Aguilar-Ruiz et al. 2000) produces a hierarchical set of rules, which may be viewed as a decision list. A real-coded GA carries out the search that extracts the rule list. The elements of this procedure are described below.
1. Coding: Each rule is represented by an individual (chromosome) in which, for each attribute, two genes define the lower and upper bounds of the rule interval.
2. Algorithm: The algorithm is a typical sequential-covering GA. It chooses the best individual of the evolutionary process, transforms it into a rule and uses that rule to eliminate data from the training file (Venturini 1993).
Initially, the set of rules R is empty; in each iteration a rule is added to R and the training file is reduced by eliminating the examples covered by the description of the rule r, independently of their class.
The main GA operators are defined as follows:
(a) Initialization: First, for each individual of the population, an example is randomly selected from the training file. Afterwards, an interval to which the example belongs is obtained.
(b) Crossover: Let [l_i^j, u_i^j] and [l_i^k, u_i^k] be the intervals of two parents, j and k, for the same attribute i. From these parents one child is generated by selecting values that satisfy l ∈ [min(l_i^j, l_i^k), max(l_i^j, l_i^k)] and u ∈ [min(u_i^j, u_i^k), max(u_i^j, u_i^k)].
(c) Mutation: A small value is subtracted from the lower boundary or added to the upper boundary, depending on which bound is mutated.
(d) Fitness function: The fitness function f considers a two-objective optimization: maximizing the number of correctly classified examples while minimizing the number of errors.
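The interval operators above can be sketched as follows (a minimal illustration under our own assumptions: function names, the per-attribute interval lists and the mutation step size are not from the paper):

```python
import random

# Sketch of the HIDER interval operators. For attribute i, parents j and k
# carry intervals [l_i^j, u_i^j] and [l_i^k, u_i^k]; the child draws each
# bound uniformly from the range spanned by the parents' respective bounds.

def crossover(parent_j, parent_k):
    """Each parent is a list of (l, u) intervals, one per attribute."""
    child = []
    for (lj, uj), (lk, uk) in zip(parent_j, parent_k):
        l = random.uniform(min(lj, lk), max(lj, lk))
        u = random.uniform(min(uj, uk), max(uj, uk))
        child.append((l, u))
    return child

def mutate(interval, delta=0.05):
    """Subtract `delta` from the lower bound or add it to the upper bound."""
    l, u = interval
    if random.random() < 0.5:
        return (l - delta, u)   # widen downwards
    return (l, u + delta)       # widen upwards
```

Note that this mutation never shrinks an interval: it always widens it by delta on one side, which pushes rules towards greater generality.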
García, S., Fernández, A., Luengo, J. et al. A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13, 959–977 (2009). https://doi.org/10.1007/s00500-008-0392-y