
Information Fusion

Volume 41, May 2018, Pages 195-216

Dynamic classifier selection: Recent advances and perspectives

https://doi.org/10.1016/j.inffus.2017.09.010

Highlights

  • An updated taxonomy of Dynamic Selection techniques is proposed.

  • A review of the state-of-the-art dynamic selection techniques is presented.

  • An empirical comparison of 18 dynamic selection techniques is conducted.

  • We discuss the recent findings and open research questions in this field.

Abstract

Multiple Classifier Systems (MCS) have been widely studied as an alternative for increasing accuracy in pattern recognition. One of the most promising MCS approaches is Dynamic Selection (DS), in which the base classifiers are selected on the fly, according to each new sample to be classified. This paper provides a review of the DS techniques proposed in the literature from both a theoretical and an empirical point of view. We propose an updated taxonomy based on the main characteristics found in a dynamic selection system: (1) The methodology used to define a local region for the estimation of the local competence of the base classifiers; (2) The source of information used to estimate the level of competence of the base classifiers, such as local accuracy, oracle, ranking, and probabilistic models; and (3) The selection approach, which determines whether a single classifier or an ensemble of classifiers is selected. We categorize the main dynamic selection techniques in the DS literature based on the proposed taxonomy. We also conduct an extensive experimental analysis, considering a total of 18 state-of-the-art dynamic selection techniques, as well as static ensemble combination and single classification models. To date, this is the first analysis comparing all the key DS techniques under the same experimental protocol. Furthermore, we also present several perspectives and open research questions that can be used as a guide for future work in this domain.

Introduction

Multiple Classifier Systems (MCS) constitute a very active area of research in machine learning and pattern recognition. In recent years, several studies have been published demonstrating their advantages over individual classifier models, based on both theoretical [1], [2], [3] and empirical [4], [5], [6] evaluations. They are widely used to solve many real-world problems, such as face recognition [7], music genre classification [8], credit scoring [9], [10], class imbalance [11], recommender systems [12], [13], software bug prediction [14], [15], intrusion detection [16], [17], and for dealing with changing environments [18], [19], [20].

Several approaches are currently used to construct an MCS, and they have been presented in many excellent reviews covering different aspects of MCS [3], [6], [21], [22], [23]. One of the most promising MCS approaches is Dynamic Selection (DS), in which the base classifiers are selected on the fly, according to each new sample to be classified. DS has become an active research topic in the multiple classifier systems literature in recent years, owing to the growing number of works reporting the superior performance of such techniques over traditional combination approaches, such as majority voting and Boosting [24], [25], [26], [27]. DS techniques work by estimating the competence level of each classifier from a pool of classifiers. Only the most competent classifier, or an ensemble containing the most competent classifiers, is selected to predict the label of a specific test sample. The rationale for such techniques is that not every classifier in the pool is an expert in classifying all unknown samples; rather, each base classifier is an expert in a different local region of the feature space [28].

In dynamic selection, the key issue is how to select the most competent classifiers for any given query sample. Usually, the competence of the classifiers is estimated over a local region of the feature space where the query sample is located. This region can be defined by different methods, such as applying the K-Nearest Neighbors technique to find the neighborhood of the query sample, or by using clustering techniques [29], [30]. The competence level of the base classifiers is then estimated, considering only the samples belonging to this local region, according to a given selection criterion; examples include the accuracy of the base classifiers in this local region [30], [31], [32], ranking [33], and probabilistic models [25], [34]. Finally, the classifier or classifiers that attain the required competence level are selected.
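To make these steps concrete, the following is a minimal sketch (not taken from the paper) of an accuracy-based dynamic classifier selection scheme in the spirit of Overall Local Accuracy (OLA), built from standard scikit-learn components. The dataset, pool size, the K = 7 neighborhood and the function name dcs_ola_predict are illustrative assumptions; labels are assumed to be 0/1 integers so that direct comparisons work.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import NearestNeighbors
    from sklearn.tree import DecisionTreeClassifier

    # Training / dynamic selection (DSEL) / test split.
    X, y = make_classification(n_samples=1500, n_features=10, random_state=42)
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=42)
    X_dsel, X_test, y_dsel, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=42)

    # Pool of weak base classifiers generated by bagging.
    pool = BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                             n_estimators=10, random_state=42).fit(X_train, y_train)

    # Region of competence: the K nearest neighbors of the query in the DSEL set.
    K = 7
    nn = NearestNeighbors(n_neighbors=K).fit(X_dsel)

    def dcs_ola_predict(x_query):
        """Select the single classifier with the highest accuracy in the region of competence."""
        _, idx = nn.kneighbors(x_query.reshape(1, -1))
        X_roc, y_roc = X_dsel[idx[0]], y_dsel[idx[0]]
        local_acc = [np.mean(clf.predict(X_roc) == y_roc) for clf in pool.estimators_]
        best = int(np.argmax(local_acc))           # most competent classifier for this query
        return pool.estimators_[best].predict(x_query.reshape(1, -1))[0]

    y_pred = np.array([dcs_ola_predict(x) for x in X_test])
    print("DCS (OLA-style) accuracy:", np.mean(y_pred == y_test))

The design choice to illustrate here is that the selection is repeated for every query: a classifier that is weak globally can still be chosen whenever its local accuracy around the query is the highest in the pool.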

In this paper, we present an updated taxonomy of dynamic classifier and ensemble selection techniques, taking into account the following three aspects: (1) The selection approach, which considers whether a single classifier is selected (known as Dynamic Classifier Selection (DCS)) or an ensemble of classifiers is selected (known as Dynamic Ensemble Selection (DES)); (2) The method used to define the local region in which the local competences of the base classifiers are estimated; and (3) The selection criteria used to estimate the competence level of the classifiers. We review and categorize the state-of-the-art dynamic classifier and ensemble selection techniques based on the proposed taxonomy.

We also discuss the increasing use of dynamic selection techniques, considering different classification contexts, such as One-Class Classification (OCC) [35], concept drift [36], [37], [38], One-Versus-One (OVO) decomposition problems [39], [40], [41], as well as the application of DS techniques to solve complex real-world problems such as signature verification [42], face recognition [7], [43], [44], [45], music classification [8] and credit scoring [10]. In particular, we describe how the properties of dynamic selection techniques can be exploited to handle the intrinsic characteristics of each problem.

An experimental analysis is conducted comparing the performance of 18 state-of-the-art dynamic classifier and ensemble selection techniques over multiple classification datasets. The DS techniques are also compared against the baseline methods, namely: (1) Static Selection (SS), i.e., the selection of an ensemble of classifiers during the training stage of the system [46]; (2) Single Best (SB), which corresponds to the performance of the best classifier in the pool according to the validation data; and (3) Majority Voting (MV), which corresponds to the majority voting combination of all classifiers in the pool without any pre-selection of classifiers. To allow a fair comparison of the techniques, all the DS and static techniques were evaluated using the same experimental protocol, i.e., the same division of datasets, as well as the same pool of classifiers. The performance of the DS techniques was also compared with that of the best classification models according to [4], including Support Vector Machines (SVM) and Random Forests.
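As a point of reference for the baselines above, the short sketch below computes the Single Best and Majority Voting baselines; it is not code from the paper, and it assumes the pool, X_dsel/y_dsel and X_test/y_test names from the earlier DCS sketch, with 0/1 integer labels.

    import numpy as np

    # (2) Single Best (SB): the classifier with the highest accuracy on the validation (DSEL) data.
    val_acc = [np.mean(clf.predict(X_dsel) == y_dsel) for clf in pool.estimators_]
    single_best = pool.estimators_[int(np.argmax(val_acc))]
    print("SB accuracy:", np.mean(single_best.predict(X_test) == y_test))

    # (3) Majority Voting (MV): combine every classifier in the pool, with no selection at all.
    votes = np.array([clf.predict(X_test) for clf in pool.estimators_]).astype(int)  # shape (M, n_test)
    mv_pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    print("MV accuracy:", np.mean(mv_pred == y_test))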

The contributions of this paper in relation to other reviews on classifier ensembles are:

  1. It proposes an updated taxonomy of dynamic selection techniques.

  2. It discusses the use of dynamic selection techniques in different contexts, including One-Versus-One (OVO) decomposition and One-Class Classification (OCC).

  3. It reviews the use of DS techniques to solve complex real-world problems such as image classification and biomedical applications.

  4. It presents an empirical comparison between several state-of-the-art dynamic selection techniques over several classification datasets under the same experimental protocol.

  5. It discusses the most recent findings in this field, and examines the open questions that can be addressed in future works.

This work is organized as follows: Section 2 presents an overview of multiple classifier system approaches. In Section 3, we propose an updated dynamic selection taxonomy, and discuss each component of a DS technique. In Section 4, we describe the most relevant dynamic selection methods and categorize them according to the proposed taxonomy. Section 5 presents a review of several real-world applications that use dynamic selection techniques to achieve a higher classification accuracy. An empirical comparison between the state-of-the-art DS techniques is conducted in Section 6. The conclusion and perspectives for future research in dynamic selection are given in the last section.

Section snippets

Basic concepts

This section presents the main concepts involved in DS approaches. They provide the background needed to understand how DS techniques work, as well as the main challenges involved in this class of techniques. The following mathematical notation is used in this paper:

  • C = {c1, …, cM} is the pool consisting of M base classifiers.

  • xj is a test sample with an unknown class label.

  • θj = {x1, …, xK} is the region of competence of xj, and xk is one instance belonging to θj.

  • P(ωl ∣ xj, ci) is the posterior probability of class ωl for the sample xj, as estimated by the base classifier ci (an example of its use as a competence measure is sketched after this list).
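As an illustration of how this posterior can serve as a source of competence information, the following is a minimal, unweighted criterion in the spirit of the probabilistic A Priori selection rule (the distance weighting used by some methods is omitted here, and the symbol δ is introduced only for exposition): the competence of classifier ci for the query xj is the average probability it assigns to the correct class over the region of competence,

\[
\delta_{i,j} \;=\; \frac{1}{K} \sum_{k=1}^{K} P\bigl(\omega_{l(k)} \mid \mathbf{x}_k,\, c_i\bigr), \qquad \mathbf{x}_k \in \theta_j,
\]

where \(\omega_{l(k)}\) denotes the true class of the neighbor \(\mathbf{x}_k\); the classifier (or set of classifiers) attaining the largest \(\delta_{i,j}\) is then selected for the query \(\mathbf{x}_j\).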

Dynamic selection

In dynamic selection, the classification of a new query sample usually involves three steps:

  1. Definition of the region of competence; that is, how to define the local region surrounding the query, xj, in which the competence level of the base classifiers is estimated.

  2. Determination of the selection criteria used to estimate the competence level of the base classifiers, e.g., Accuracy, Probabilistic, and Ranking (an accuracy-based criterion is formalized after this list).

  3. Determination of the selection mechanism, which chooses either a single classifier (DCS) or an ensemble of classifiers (DES).
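For example, for step 2, the simplest accuracy-based criterion, Overall Local Accuracy (OLA), scores each classifier \(c_i\) by its accuracy over the region of competence (the indicator notation below is introduced here for convenience):

\[
\delta_{i,j}^{\mathrm{OLA}} \;=\; \frac{1}{K} \sum_{k=1}^{K} \mathbb{1}\bigl[\, c_i(\mathbf{x}_k) = y_k \,\bigr], \qquad \mathbf{x}_k \in \theta_j,
\]

where \(y_k\) is the true label of the neighbor \(\mathbf{x}_k\). For step 3, a DCS method selects the single classifier maximizing this score, whereas a DES method keeps every classifier whose score satisfies a given condition (in KNORA-Eliminate, only those with a perfect score over the region).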

Dynamic selection techniques

In this section, we present a review of the most relevant dynamic selection algorithms. The DS techniques were chosen taking into account their importance in the literature through the introduction of new concepts in the area (i.e., methods that introduced different ways of defining the competence region or the selection criteria), their number of citations, as well as the availability of source code. Minor variations of an existing technique, such as different versions of the KNORA-E technique proposed
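Since KNORA-Eliminate is one of the most widely cited DES techniques in this family, the following is a minimal sketch of its selection rule, reusing the pool, nn, X_dsel and y_dsel objects assumed in the earlier DCS sketch; the fallback to the whole pool when no classifier survives the elimination is one common implementation choice, not necessarily the rule of every published variant.

    import numpy as np

    def knora_eliminate_predict(x_query, k_max=7):
        """KNORA-Eliminate sketch: keep only the classifiers that correctly classify
        every neighbor in the region of competence; if none exist, shrink the region.
        The surviving classifiers then predict the query by majority voting."""
        _, idx = nn.kneighbors(x_query.reshape(1, -1), n_neighbors=k_max)
        neighbors = idx[0]                               # neighbors sorted by distance
        selected = []
        for k in range(k_max, 0, -1):                    # progressively reduce the region
            X_roc, y_roc = X_dsel[neighbors[:k]], y_dsel[neighbors[:k]]
            selected = [clf for clf in pool.estimators_
                        if np.all(clf.predict(X_roc) == y_roc)]
            if selected:
                break
        if not selected:                                 # fallback assumed in this sketch
            selected = list(pool.estimators_)
        votes = np.array([clf.predict(x_query.reshape(1, -1))[0]
                          for clf in selected]).astype(int)
        return np.bincount(votes).argmax()

A closely related technique from the same family, KNORA-Union, relaxes the elimination step: each base classifier receives one vote for every neighbor it classifies correctly, so no classifier is ever completely discarded.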

Applications

In this section, we present a review of real-world applications using dynamic selection techniques. Moreover, we also discuss how the authors adapt traditional DS techniques to the intrinsic characteristics of their applications; these include aspects such as imbalanced distributions in customer classification and credit scoring [126], and the lack of validation samples in face recognition applications [7], [45].

Table 2 lists several real-world applications of DS techniques. Based on usage

Comparative study

In this section, we present an empirical comparison between 18 state-of-the-art techniques under the same experimental protocol. First, we compare the results of each DS technique (Section 6.3). The DS techniques are then also compared against the baseline methods, namely, (1) Static Selection (SS), that is, the selection of an EoC during the training stage, and its combination using a majority voting scheme [46]; (2) Single Best (SB), which corresponds to the performance of the best

Conclusion and perspectives

In this paper, we presented an updated taxonomy of dynamic selection systems. The key points of a dynamic selection system are analyzed: 1) the methodology used for the definition of the region of competence used to estimate the local competences of the base classifiers; 2) the source of information used to estimate the competence of the base classifiers; and 3) the selection approach used to determine whether a single base classifier or an ensemble of classifiers is selected. Then, we present

Acknowledgment

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the École de technologie supérieure (ÉTS Montréal) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico).

References (177)

  • T. Woloszynski et al.

    A measure of competence based on random classification for dynamic ensemble selection

    Inf. Fus.

    (2012)
  • B. Krawczyk et al.

    Dynamic classifier selection for one-class classification

    Knowl. Based Syst.

    (2016)
  • A. Tsymbal et al.

    Dynamic integration of classifiers for handling concept drift

    Inf. Fus.

    (2008)
  • I. Mendialdua et al.

    Dynamic selection of the best base classifier in one versus one

    Knowl. Based Syst.

    (2015)
  • M. Galar et al.

    Dynamic classifier selection for one-vs-one strategy: avoiding non-competent classifiers

    Pattern Recognit.

    (2013)
  • Z.-L. Zhang et al.

    Exploring the effectiveness of dynamic ensemble selection in the one-versus-one scheme

    Knowl. Based Syst.

    (2017)
  • L. Batista et al.

    Dynamic selection of generative–discriminative ensembles for off-line signature verification

    Pattern Recognit.

    (2012)
  • S. Bashbaghi et al.

    Dynamic ensembles of exemplar-SVMs for still-to-video face recognition

    Pattern Recognit.

    (2017)
  • D. Ruta et al.

    Classifier selection for majority voting

    Inf. Fus.

    (2005)
  • M. Skurichina et al.

    Bagging for linear classifiers

    Pattern Recognit.

    (1998)
  • A. Rahman et al.

    Effect of ensemble classifier composition on offline cursive character recognition

    Inf. Process. Manage.

    (2013)
  • L. Rokach

    Decision forest: twenty years of research

    Inf. Fus.

    (2016)
  • G. Giacinto et al.

    Design of effective neural network ensembles for image classification purposes

    Image Vis. Comput.

    (2001)
  • R.M.O. Cruz et al.

    Feature representation selection based on classifier projection space and oracle analysis

    Expert Syst. Appl.

    (2013)
  • E.M. dos Santos et al.

    Overfitting cautious selection of classifier ensembles with genetic algorithms

    Inf. Fus.

    (2009)
  • B. Gabrys et al.

    Genetic algorithms in classifier fusion

    Appl. Soft Comput.

    (2006)
  • Z.-H. Zhou et al.

    Ensembling neural networks: many could be better than all

    Artif. Intell.

    (2002)
  • R.E. Banfield et al.

    Ensemble diversity measures and their application to thinning

    Inf. Fus.

    (2005)
  • L.I. Kuncheva et al.

    Decision templates for multiple classifier fusion: an experimental comparison

    Pattern Recognit.

    (2001)
  • G.L. Rogova

    Combining the results of several neural network classifiers

    Neural Netw.

    (1994)
  • D.M.J. Tax et al.

    Combining multiple classifiers by averaging or by multiplying?

    Pattern Recognit.

    (2000)
  • L. Lam et al.

    Optimal combinations of pattern classifiers

    Pattern Recognit. Lett.

    (1995)
  • D.H. Wolpert

    Stacked generalization

    Neural Netw.

    (1992)
  • Š. Raudys

    Trainable fusion rules. II. Small sample-size effects

    Neural Netw.

    (2006)
  • Š. Raudys

    Trainable fusion rules. I. Large sample size case

    Neural Netw.

    (2006)
  • D. Štefka et al.

    Dynamic classifier aggregation using interaction-sensitive fuzzy measures

    Fuzzy Sets Syst.

    (2015)
  • R.M.O. Cruz et al.

    META-DES.Oracle: meta-learning and feature selection for dynamic ensemble selection

    Inf. Fus.

    (2017)
  • L.I. Kuncheva

    A theoretical study on six classifier fusion strategies

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2002)
  • T.G. Dietterich

    Ensemble methods in machine learning

    International Workshop on Multiple Classifier Systems

    (2000)
  • L.I. Kuncheva

    Combining Pattern Classifiers: Methods and Algorithms

    (2004)
  • M. Fernández-Delgado et al.

    Do we need hundreds of classifiers to solve real world classification problems?

    J. Mach. Learn. Res.

    (2014)
  • D. Opitz et al.

    Popular ensemble methods: an empirical study

    J. Artif. Intell. Res.

    (1999)
  • R. Polikar

    Ensemble based systems in decision making

    IEEE Circuits Syst. Mag.

    (2006)
  • S. Bashbaghi et al.

    Dynamic selection of exemplar-SVMs for watch-list screening through domain adaptation

    Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods (ICPRAM)

    (2017)
  • P.R.L. de Almeida et al.

    Music genre classification using dynamic selection of ensemble of classifiers

    2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

    (2012)
  • M. Galar et al.

    A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches

    IEEE Trans. Syst. Man. Cybern. Part C

    (2012)
  • M. Jahrer et al.

    Combining predictions for accurate recommender systems

    Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2010)
  • D. Di Nucci et al.

    Dynamic selection of classifiers in bug prediction: an adaptive method

    IEEE Trans. Emerg. Topics Comput. Intell.

    (2017)
  • A. Panichella et al.

    Cross-project defect prediction models: L’union fait la force

    2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE)

    (2014)