
Applied Soft Computing

Volume 11, Issue 3, April 2011, Pages 3032-3045

Evolutionary selection of hyperrectangles in nested generalized exemplar learning

https://doi.org/10.1016/j.asoc.2010.11.030

Abstract

The nested generalized exemplar (NGE) theory accomplishes learning by storing objects in Euclidean n-space as hyperrectangles. New data are classified by computing their distance to the nearest “generalized exemplar” or hyperrectangle. This learning method combines distance-based classification with the axis-parallel rectangle representation employed in most rule-learning systems. In this paper, we propose the use of evolutionary algorithms to select the most influential hyperrectangles, yielding accurate and simple models in classification tasks. The proposal has been compared with the most representative models based on hyperrectangle learning, such as BNGE, RISE, INNER, and the genetics-based learning approach SIA. Our approach is also very competitive with classical rule induction algorithms such as C4.5Rules and RIPPER. The results have been contrasted through non-parametric statistical tests over multiple data sets and indicate that our approach outperforms these methods in terms of accuracy while requiring fewer hyperrectangles to be stored, thus producing simpler models than previous NGE approaches. Larger data sets have also been tackled with promising outcomes.

Introduction

Exemplar-based learning was originally proposed in [1] and comprises a set of methods widely used in machine learning and data mining [2], [3]. A similar scheme for learning from examples is based on the nested generalized exemplar (NGE) theory. It was introduced in [4] and makes several significant modifications to the exemplar-based learning model. The most important one is that it retains the notion of storing verbatim examples in memory but also allows examples to be generalized. NGE learning algorithms are strongly related to the nearest neighbor (NN) classifier [5] and were proposed to extend it. They are very popular for their simplicity and efficient results.

In NGE theory, generalizations take the form of hyperrectangles in a Euclidean n-space, so it can be regarded as an exemplar-based generalization model. Hyperrectangles may be nested, and inner hyperrectangles serve as exceptions to surrounding ones. A specific example can be viewed as a minimal hyperrectangle. Hyperrectangles correspond to the axis-parallel rectangle representation employed in most rule-learning systems [6]. After the learning process, a new example is classified by computing the Euclidean distance between the example and each of the hyperrectangles, predicting the class of the nearest hyperrectangle. If two or more hyperrectangles cover the example, a conflict resolution method has to be used to determine the predicted class [4].
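To make the classification rule concrete, the following Python sketch (our illustration, not code from the paper) computes the distance from a query point to an axis-parallel hyperrectangle over normalized numeric attributes and predicts the class of the nearest one. Ties among hyperrectangles that cover the query (distance zero) are broken here by preferring the smaller volume, one of the conflict-resolution heuristics discussed in [4].

import numpy as np

def rect_distance(x, lower, upper):
    """Euclidean distance from point x to the hyperrectangle [lower, upper].

    Per dimension, the contribution is 0 if x lies inside the interval,
    and otherwise the gap to the nearest face.
    """
    gap = np.maximum(lower - x, 0.0) + np.maximum(x - upper, 0.0)
    return float(np.sqrt(np.sum(gap ** 2)))

def classify(x, rectangles):
    """Predict the class label of the nearest hyperrectangle.

    `rectangles` is a list of (lower, upper, label) triples; single
    instances are minimal hyperrectangles with lower == upper.
    """
    lower, upper, label = min(
        rectangles,
        key=lambda r: (rect_distance(x, r[0], r[1]), np.prod(r[1] - r[0])))
    return label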

Several works argue for the benefits of using hyperrectangles together with instances to form the classification rule [7], [8], [9]. With respect to instance-based classification [1], hyperrectangles make the data stored for classifying unseen examples more comprehensible and achieve a substantial compression of the data, reducing storage requirements. Considering rule induction [6], the ability to model decision surfaces through hybridizations between distance-based methods (Voronoi diagrams) and parallel-axis separators could improve performance in domains with clusters of exemplars or exemplars strung out along a curve. In addition, NGE learning allows us to capture generalizations with exceptions.

Methods for generating nearest-hyperrectangle classifiers can work in an incremental fashion, such as EACH [4], or in batch mode (BNGE [7], RISE [8], FAN [10] and INNER [9]). Incremental methods depend on the order in which examples are presented and usually offer poor results in standard classification, although they could be used in on-line learning scenarios. Batch-mode methods employ heuristics to determine which exemplars to merge or generalize at each stage. They offer very interesting results and usually outperform the NN classifier [7].

Extensions to NGE can be found in the specialized literature. Heath et al. [11] address whether reducing the memory capacity of a learning algorithm affects the speed of learning for a particular concept class, that of nested hyperrectangles. In [12], the authors investigate the impact of three distance functions, namely HVDM, IVDM and WVDM [13], on the predictive accuracy of the concepts learnt by NGE. The fuzzy NGE model was proposed in [14], [15], and the transformation of neural-network-based knowledge into NGE-based knowledge was investigated in [16], [17]. An interesting study analyzing hybridizations of exemplar-based learning with other machine learning paradigms can be found in [18].

The problem of yielding an optimal number of hyperrectangles for classifying a set of points is NP-hard. A large but finite set of hyperrectangles can easily be obtained by a simple heuristic algorithm acting over the training data. However, almost all of the hyperrectangles produced could be irrelevant and, as a result, the most influential ones must be identified. This complete set of hyperrectangles is thus a suitable target for a data reduction technique [19]. Evolutionary algorithms (EAs) [20] have been used for data reduction with promising results [21], [22]. They have been successfully applied to feature selection [23], [24], [25], [26], [27], instance selection [28], [29], [30], [31], simultaneous instance and feature selection [32], [33] and under-sampling for imbalanced learning [34], [35]. NGE is also directly related to clustering, and EAs have been extensively used for this problem [36]. EAs for clustering could be useful as alternative components of NGE learning, especially for obtaining the initial candidate set of hyperrectangles.
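As an illustration of such a heuristic (a minimal sketch under our own assumptions, not necessarily the exact procedure used in Section 3.3), each training example can be grown into a candidate hyperrectangle by merging it with its nearest same-class neighbours for as long as the expanded hyperrectangle covers no example of a different class:

import numpy as np

def covers(lower, upper, x):
    # True if x falls inside the axis-parallel hyperrectangle [lower, upper]
    return bool(np.all(lower <= x) and np.all(x <= upper))

def candidate_rectangles(X, y):
    """Build one candidate hyperrectangle per training example."""
    rects = []
    for i, xi in enumerate(X):
        lower, upper = xi.copy(), xi.copy()
        # visit same-class examples from nearest to farthest
        same = sorted((j for j in range(len(X)) if y[j] == y[i] and j != i),
                      key=lambda j: np.linalg.norm(X[j] - xi))
        for j in same:
            lo = np.minimum(lower, X[j])
            up = np.maximum(upper, X[j])
            # stop growing as soon as an example of another class is covered
            if any(covers(lo, up, X[k]) for k in range(len(X)) if y[k] != y[i]):
                break
            lower, upper = lo, up
        rects.append((lower, upper, y[i]))
    return rects

The resulting set is large and highly redundant, which is precisely why a subsequent selection stage is needed.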

In this paper, we propose the use of EAs for hyperrectangle selection in classification tasks. A similar approach is SIA [37], a genetics-based machine learning method that obtains a set of rules by computing distances among rules. Our objective is to increase the accuracy of this type of representation by selecting the most suitable set of hyperrectangles, the one that optimizes the nearest-hyperrectangle classification rule. We compare our approach with other NGE learning models, such as BNGE, RISE, INNER and SIA, and with two well-known rule induction learning methods: RIPPER and C4.5Rules. The empirical study has been contrasted via non-parametric statistical testing [38], [39], [40], [41]. The results show an improvement in accuracy while the number of hyperrectangles stored in the final subset is reduced; this outcome is especially noticeable on larger data sets. Regarding classic rule induction, we observe that our proposal adapts better to different types of input data.

We note that the proposal described in this paper extends the algorithm described in our previous work [42]. The previous version presented some weaknesses related to the small number of examples covered by the learned hyperrectangles and the treatment of noisy examples. In this paper, the coverage of the hyperrectangles is incorporated into the proposal, and we present a modification based on a preliminary noise-filtering stage. In addition, more data sets (including large data sets) and appropriate statistical tools have been used to justify the conclusions reached.

The paper is organized as follows. Section 2 explains the NGE learning model. Section 3 describes all topics concerning the proposed approach. Section 4 presents the experimental framework, and Section 5 the results and analysis. Section 6 highlights the conclusions. Finally, Appendix A illustrates the comparisons of our proposal with the other techniques through star plots.

Section snippets

NGE learning model

NGE is a learning paradigm based on class exemplars, where an induced hypothesis has the graphical shape of a set of hyperrectangles in an M-dimensional Euclidean space. Exemplars of classes are either hyperrectangles or single instances [4]. The input of an NGE system is a set of training examples, each described as a vector of attribute/value pairs and an associated class. Attributes can be either numerical or categorical. Numerical attributes are usually normalized into the [0,1] interval.
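As a small sketch of that preprocessing step (our illustration, assuming plain min-max scaling), numeric attributes can be mapped into [0,1] so that every attribute contributes comparably to the distance computation:

import numpy as np

def normalize(X):
    """Min-max normalize each numeric attribute (column) into [0, 1]."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    rng = np.where(mx > mn, mx - mn, 1.0)  # guard against constant attributes
    return (X - mn) / rng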

Evolutionary selection of hyperrectangles

The approach proposed in this paper, named evolutionary hyperrectangle selection by CHC (EHS-CHC), is fully explained in this section. First, Section 3.1 introduces the CHC model used as the EA that performs hyperrectangle selection. After this, Section 3.2 specifies the representation and the fitness function. Section 3.3 describes the process for generating the initial set of hyperrectangles, and Section 3.4 presents the extended version of EHS-CHC: filtered EHS-CHC.
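To convey the flavour of the selection stage, the sketch below (ours; the weight alpha and the exact form of the function are assumptions, since the paper's fitness in Section 3.2 also rewards example coverage) evaluates a binary chromosome in which each gene marks whether a candidate hyperrectangle is kept, balancing training accuracy against the reduction of the stored set:

import numpy as np

def fitness(chromosome, rects, X, y, alpha=0.5):
    """Alpha-weighted trade-off between accuracy and reduction, a common
    scheme in evolutionary data reduction; `classify` is the nearest-
    hyperrectangle rule sketched in the introduction."""
    kept = [r for bit, r in zip(chromosome, rects) if bit]
    if not kept:
        return 0.0  # an empty model classifies nothing
    acc = np.mean([classify(x, kept) == label for x, label in zip(X, y)])
    red = 1.0 - len(kept) / len(rects)
    return alpha * acc + (1.0 - alpha) * red

CHC itself evolves a population of such chromosomes with HUX crossover, incest prevention and restarts in place of mutation [see Section 3.1].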

Experimental framework

In this section we first provide details of the real-world problems chosen for the experimentation and the configuration parameters of the methods studied (Sections 4.1 and 4.2). Finally, we present the statistical tests applied to compare the results obtained with the different approaches (Section 4.3).

Results and analysis

This section shows the results obtained in the experimental study, together with their analysis. The study is divided into two parts: experiments on small data sets (Section 5.1) and experiments on medium data sets (Section 5.2). In addition, Section 5.3 studies the efficiency of the NGE models considered in this paper.

Concluding remarks

The purpose of this paper is to present EHS-CHC, an evolutionary hyperrectangle selection algorithm for nested generalized exemplar learning in classification. It creates an initial set of hyperrectangles from the training data and then performs a selection process focused on maximizing accuracy and example coverage with the lowest possible number of hyperrectangles.

The results show that EHS-CHC allows us to obtain very accurate models with a low number of hyperrectangles.

Acknowledgement

This work was supported by TIN2008-06681-C06-01. J. Derrac holds a research scholarship from the University of Granada. J. Luengo holds an FPU scholarship from the Spanish Ministry of Education and Science. We are grateful to the reviewers for identifying essential issues and providing important and valuable feedback; their suggestions and comments have been very useful in improving the quality of this paper.

References (49)

  • S. García et al., Enhancing the effectiveness and interpretability of decision tree and rule induction classifiers with evolutionary training set selection over imbalanced problems, Applied Soft Computing (2009)
  • S. García et al., Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power, Information Sciences (2010)
  • L.J. Eshelman, The CHC adaptive search algorithm: how to have safe search when engaging in nontraditional genetic recombination
  • D.W. Aha et al., Instance-based learning algorithms, Machine Learning (1991)
  • I.H. Witten et al., Data Mining: Practical Machine Learning Tools and Techniques (2005)
  • I. Kononenko et al., Machine Learning and Data Mining: Introduction to Principles and Algorithms (2007)
  • S. Salzberg, A nearest hyperrectangle learning method, Machine Learning (1991)
  • T.M. Cover et al., Nearest neighbor pattern classification, IEEE Transactions on Information Theory (1967)
  • J. Fürnkranz, Separate-and-conquer rule learning, Artificial Intelligence Review (1999)
  • D. Wettschereck et al., An experimental comparison of the nearest-neighbor and nearest-hyperrectangle algorithms, Machine Learning (1995)
  • P. Domingos, Unifying instance-based and rule-based induction, Machine Learning (1996)
  • O. Luaces et al., Inflating examples to obtain rules, International Journal of Intelligent Systems (2003)
  • D.G. Heath et al., Learning nested concept classes with limited storage, Journal of Experimental and Theoretical Artificial Intelligence (1996)
  • L.B. Figueira et al., Evaluating the effects of distance metrics on a NGE-based system
