
Information Fusion

Volume 41, May 2018, Pages 195-216

Dynamic classifier selection: Recent advances and perspectives

https://doi.org/10.1016/j.inffus.2017.09.010

Highlights

  • An updated taxonomy of Dynamic Selection techniques is proposed.

  • A review of the state-of-the-art dynamic selection techniques is presented.

  • An empirical comparison of 18 dynamic selection techniques is conducted.

  • We discuss the recent findings and open research questions in this field.

Abstract

Multiple Classifier Systems (MCS) have been widely studied as an alternative for increasing accuracy in pattern recognition. One of the most promising MCS approaches is Dynamic Selection (DS), in which the base classifiers are selected on the fly, according to each new sample to be classified. This paper provides a review of the DS techniques proposed in the literature from both a theoretical and an empirical point of view. We propose an updated taxonomy based on the main characteristics found in a dynamic selection system: (1) The methodology used to define a local region for the estimation of the local competence of the base classifiers; (2) The source of information used to estimate the level of competence of the base classifiers, such as local accuracy, oracle, ranking, and probabilistic models; and (3) The selection approach, which determines whether a single classifier or an ensemble of classifiers is selected. We categorize the main dynamic selection techniques in the DS literature based on the proposed taxonomy. We also conduct an extensive experimental analysis, considering a total of 18 state-of-the-art dynamic selection techniques, as well as static ensemble combination and single classification models. To date, this is the first analysis comparing all the key DS techniques under the same experimental protocol. Furthermore, we also present several perspectives and open research questions that can be used as a guide for future work in this domain.

Introduction

Multiple Classifier Systems (MCS) constitute a very active area of research in machine learning and pattern recognition. In recent years, several studies have been published demonstrating their advantages over individual classifier models, based on both theoretical [1], [2], [3] and empirical [4], [5], [6] evaluations. They are widely used to solve many real-world problems, such as face recognition [7], music genre classification [8], credit scoring [9], [10], class imbalance [11], recommender systems [12], [13], software bug prediction [14], [15], intrusion detection [16], [17], and for dealing with changing environments [18], [19], [20].

Several approaches are currently used to construct an MCS, and they have been presented in many excellent reviews covering different aspects of MCS [3], [6], [21], [22], [23]. One of the most promising MCS approaches is Dynamic Selection (DS), in which the base classifiers are selected on the fly, according to each new sample to be classified. DS has become an active research topic in the multiple classifier systems literature in recent years, owing to the growing number of works reporting the superior performance of such techniques over traditional combination approaches, such as majority voting and Boosting [24], [25], [26], [27]. DS techniques work by estimating the competence level of each classifier from a pool of classifiers. Only the most competent classifier, or an ensemble containing the most competent classifiers, is selected to predict the label of a specific test sample. The rationale for such techniques is that not every classifier in the pool is an expert in classifying all unknown samples; rather, each base classifier is an expert in a different local region of the feature space [28].

In dynamic selection, the key issue is how to select the most competent classifiers for any given query sample. Usually, the competence of the classifiers is estimated over a local region of the feature space where the query sample is located. This region can be defined by different methods, such as applying the K-Nearest Neighbors technique to find the neighborhood of the query sample, or by using clustering techniques [29], [30]. The competence level of the base classifiers is then estimated, considering only the samples belonging to this local region, according to a given selection criterion; examples include the accuracy of the base classifiers in this local region [30], [31], [32], ranking [33], and probabilistic models [25], [34]. Finally, the classifier or classifiers that attain the required competence level are selected.
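To make these steps concrete, the following is a minimal sketch (not taken from the paper) of an accuracy-based dynamic classifier selection scheme in the spirit of Overall Local Accuracy (OLA), built from standard scikit-learn components. The dataset, pool size, the K = 7 neighborhood and the function name dcs_ola_predict are illustrative assumptions; labels are assumed to be 0/1 integers so that direct comparisons work.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import NearestNeighbors
    from sklearn.tree import DecisionTreeClassifier

    # Training / dynamic selection (DSEL) / test split.
    X, y = make_classification(n_samples=1500, n_features=10, random_state=42)
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=42)
    X_dsel, X_test, y_dsel, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=42)

    # Pool of weak base classifiers generated by bagging.
    pool = BaggingClassifier(DecisionTreeClassifier(max_depth=3),
                             n_estimators=10, random_state=42).fit(X_train, y_train)

    # Region of competence: the K nearest neighbors of the query in the DSEL set.
    K = 7
    nn = NearestNeighbors(n_neighbors=K).fit(X_dsel)

    def dcs_ola_predict(x_query):
        """Select the single classifier with the highest accuracy in the region of competence."""
        _, idx = nn.kneighbors(x_query.reshape(1, -1))
        X_roc, y_roc = X_dsel[idx[0]], y_dsel[idx[0]]
        local_acc = [np.mean(clf.predict(X_roc) == y_roc) for clf in pool.estimators_]
        best = int(np.argmax(local_acc))           # most competent classifier for this query
        return pool.estimators_[best].predict(x_query.reshape(1, -1))[0]

    y_pred = np.array([dcs_ola_predict(x) for x in X_test])
    print("DCS (OLA-style) accuracy:", np.mean(y_pred == y_test))

The design choice to illustrate here is that the selection is repeated for every query: a classifier that is weak globally can still be chosen whenever its local accuracy around the query is the highest in the pool.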

In this paper, we present an updated taxonomy of dynamic classifier and ensemble selection techniques, taking into account the following three aspects: (1) The selection approach, which considers whether a single classifier is selected (known as Dynamic Classifier Selection (DCS)) or an ensemble of classifiers is selected (known as Dynamic Ensemble Selection (DES)); (2) The method used to define the local region in which the local competences of the base classifiers are estimated; and (3) The selection criteria used to estimate the competence level of the classifiers. We review and categorize the state-of-the-art dynamic classifier and ensemble selection techniques based on the proposed taxonomy.

We also discuss the increasing use of dynamic selection techniques, considering different classification contexts, such as One-Class Classification (OCC) [35], concept drift [36], [37], [38], One-Versus-One (OVO) decomposition problems [39], [40], [41], as well as the application of DS techniques to solve complex real-world problems such as signature verification [42], face recognition [7], [43], [44], [45], music classification [8] and credit scoring [10]. In particular, we describe how the properties of dynamic selection techniques can be exploited to handle the intrinsic characteristics of each problem.

An experimental analysis is conducted comparing the performance of 18 state-of-the-art dynamic classifier and ensemble selection techniques over multiple classification datasets. The DS techniques are also compared against the baseline methods, namely: (1) Static Selection (SS), i.e., the selection of an ensemble of classifiers during the training stage of the system [46]; (2) Single Best (SB), which corresponds to the performance of the best classifier in the pool according to the validation data; and (3) Majority Voting (MV), which corresponds to the majority voting combination of all classifiers in the pool without any pre-selection of classifiers. To allow a fair comparison of the techniques, all the DS and static techniques were evaluated using the same experimental protocol, i.e., the same division of datasets, as well as the same pool of classifiers. The performance of the DS techniques was also compared with that of the best classification models according to [4], including Support Vector Machines (SVM) and Random Forests.
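As a point of reference for the baselines above, the short sketch below computes the Single Best and Majority Voting baselines; it is not code from the paper, and it assumes the pool, X_dsel/y_dsel and X_test/y_test names from the earlier DCS sketch, with 0/1 integer labels.

    import numpy as np

    # (2) Single Best (SB): the classifier with the highest accuracy on the validation (DSEL) data.
    val_acc = [np.mean(clf.predict(X_dsel) == y_dsel) for clf in pool.estimators_]
    single_best = pool.estimators_[int(np.argmax(val_acc))]
    print("SB accuracy:", np.mean(single_best.predict(X_test) == y_test))

    # (3) Majority Voting (MV): combine every classifier in the pool, with no selection at all.
    votes = np.array([clf.predict(X_test) for clf in pool.estimators_]).astype(int)  # shape (M, n_test)
    mv_pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    print("MV accuracy:", np.mean(mv_pred == y_test))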

The contributions of this paper in relation to other reviews on classifier ensembles are:

  1. It proposes an updated taxonomy of dynamic selection techniques.

  2. It discusses the use of dynamic selection techniques in different contexts, including One-Versus-One (OVO) decomposition and One-Class Classification (OCC).

  3. It reviews the use of DS techniques to solve complex real-world problems such as image classification and biomedical applications.

  4. It presents an empirical comparison between several state-of-the-art dynamic selection techniques over several classification datasets under the same experimental protocol.

  5. It discusses the most recent findings in this field, and examines the open questions that can be addressed in future works.

This work is organized as follows: Section 2 presents an overview of multiple classifier system approaches. In Section 3, we propose an updated dynamic selection taxonomy, and discuss each component of a DS technique. In Section 4, we describe the most relevant dynamic selection methods and categorize them according to the proposed taxonomy. Section 5 presents a review of several real-world applications that use dynamic selection techniques to achieve a higher classification accuracy. An empirical comparison between the state-of-the-art DS techniques is conducted in Section 6. The conclusion and perspectives for future research in dynamic selection are given in the last section.

Section snippets

Basic concepts

This section presents the main concepts involved in DS approaches. They provide the background needed to understand how DS techniques work, as well as the main challenges involved in this class of techniques. The following mathematical notation is used in this paper:

  • C = {c1, …, cM} is the pool consisting of M base classifiers.

  • xj is a test sample with an unknown class label.

  • θj = {x1, …, xK} is the region of competence of xj, and xk is one instance belonging to θj.

  • P(ωl ∣ xj, ci) is the posterior probability of class ωl for the sample xj, as estimated by the base classifier ci (an example of its use as a competence measure is sketched after this list).
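As an illustration of how this posterior can serve as a source of competence information, the following is a minimal, unweighted criterion in the spirit of the probabilistic A Priori selection rule (the distance weighting used by some methods is omitted here, and the symbol δ is introduced only for exposition): the competence of classifier ci for the query xj is the average probability it assigns to the correct class over the region of competence,

\[
\delta_{i,j} \;=\; \frac{1}{K} \sum_{k=1}^{K} P\bigl(\omega_{l(k)} \mid \mathbf{x}_k,\, c_i\bigr), \qquad \mathbf{x}_k \in \theta_j,
\]

where \(\omega_{l(k)}\) denotes the true class of the neighbor \(\mathbf{x}_k\); the classifier (or set of classifiers) attaining the largest \(\delta_{i,j}\) is then selected for the query \(\mathbf{x}_j\).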

Dynamic selection

In dynamic selection, the classification of a new query sample usually involves three steps:

  1. Definition of the region of competence; that is, how to define the local region surrounding the query, xj, in which the competence level of the base classifiers is estimated.

  2. Determination of the selection criteria used to estimate the competence level of the base classifiers, e.g., Accuracy, Probabilistic, and Ranking (an accuracy-based criterion is formalized after this list).

  3. Determination of the selection mechanism, which chooses either a single classifier (DCS) or an ensemble of classifiers (DES).
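For example, for step 2, the simplest accuracy-based criterion, Overall Local Accuracy (OLA), scores each classifier \(c_i\) by its accuracy over the region of competence (the indicator notation below is introduced here for convenience):

\[
\delta_{i,j}^{\mathrm{OLA}} \;=\; \frac{1}{K} \sum_{k=1}^{K} \mathbb{1}\bigl[\, c_i(\mathbf{x}_k) = y_k \,\bigr], \qquad \mathbf{x}_k \in \theta_j,
\]

where \(y_k\) is the true label of the neighbor \(\mathbf{x}_k\). For step 3, a DCS method selects the single classifier maximizing this score, whereas a DES method keeps every classifier whose score satisfies a given condition (in KNORA-Eliminate, only those with a perfect score over the region).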

Dynamic selection techniques

In this section, we present a review of the most relevant dynamic selection algorithms. The DS techniques were chosen taking into account their importance in the literature through the introduction of new concepts in the area (i.e., methods that introduced different ways of defining the competence region or the selection criteria), their number of citations, as well as the availability of source code. Minor variations of an existing technique, such as different versions of the KNORA-E technique proposed
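Since KNORA-Eliminate is one of the most widely cited DES techniques in this family, the following is a minimal sketch of its selection rule, reusing the pool, nn, X_dsel and y_dsel objects assumed in the earlier DCS sketch; the fallback to the whole pool when no classifier survives the elimination is one common implementation choice, not necessarily the rule of every published variant.

    import numpy as np

    def knora_eliminate_predict(x_query, k_max=7):
        """KNORA-Eliminate sketch: keep only the classifiers that correctly classify
        every neighbor in the region of competence; if none exist, shrink the region.
        The surviving classifiers then predict the query by majority voting."""
        _, idx = nn.kneighbors(x_query.reshape(1, -1), n_neighbors=k_max)
        neighbors = idx[0]                               # neighbors sorted by distance
        selected = []
        for k in range(k_max, 0, -1):                    # progressively reduce the region
            X_roc, y_roc = X_dsel[neighbors[:k]], y_dsel[neighbors[:k]]
            selected = [clf for clf in pool.estimators_
                        if np.all(clf.predict(X_roc) == y_roc)]
            if selected:
                break
        if not selected:                                 # fallback assumed in this sketch
            selected = list(pool.estimators_)
        votes = np.array([clf.predict(x_query.reshape(1, -1))[0]
                          for clf in selected]).astype(int)
        return np.bincount(votes).argmax()

A closely related technique from the same family, KNORA-Union, relaxes the elimination step: each base classifier receives one vote for every neighbor it classifies correctly, so no classifier is ever completely discarded.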

Applications

In this section, we present a review of real-world applications using dynamic selection techniques. Moreover, we also discuss how the authors adapt traditional DS techniques to the intrinsic characteristics of their applications; these include aspects such as imbalanced distributions in customer classification and credit scoring [126], and the lack of validation samples in face recognition applications [7], [45].

Table 2 lists several real-world applications of DS techniques. Based on usage

Comparative study

In this section, we present an empirical comparison between 18 state-of-the-art techniques under the same experimental protocol. First, we compare the results of each DS technique (Section 6.3). The DS techniques are then also compared against the baseline methods, namely, (1) Static Selection (SS), that is, the selection of an EoC during the training stage, and its combination using a majority voting scheme [46]; (2) Single Best (SB), which corresponds to the performance of the best

Conclusion and perspectives

In this paper, we presented an updated taxonomy of dynamic selection systems. The key points of a dynamic selection system are analyzed: 1) the methodology used for the definition of the region of competence used to estimate the local competences of the base classifiers; 2) the source of information used to estimate the competence of the base classifiers; and 3) the selection approach used to determine whether a single base classifier or an ensemble of classifiers is selected. Then, we present

Acknowledgment

This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), the École de technologie supérieure (ÉTS Montréal) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico).

References (177)

  • T. Woloszynski et al.

    A measure of competence based on random classification for dynamic ensemble selection

    Inf. Fus.

    (2012)
  • B. Krawczyk et al.

    Dynamic classifier selection for one-class classification

    Knowl. Based Syst.

    (2016)
  • A. Tsymbal et al.

    Dynamic integration of classifiers for handling concept drift

    Inf. Fus.

    (2008)
  • I. Mendialdua et al.

    Dynamic selection of the best base classifier in one versus one

    Knowl. Based Syst.

    (2015)
  • M. Galar et al.

    Dynamic classifier selection for one-vs-one strategy: avoiding non-competent classifiers

    Pattern Recognit.

    (2013)
  • Z.-L. Zhang et al.

    Exploring the effectiveness of dynamic ensemble selection in the one-versus-one scheme

    Knowl. Based Syst.

    (2017)
  • L. Batista et al.

    Dynamic selection of generative–discriminative ensembles for off-line signature verification

    Pattern Recognit.

    (2012)
  • S. Bashbaghi et al.

    Dynamic ensembles of exemplar-SVMs for still-to-video face recognition

    Pattern Recognit.

    (2017)
  • D. Ruta et al.

    Classifier selection for majority voting

    Inf. Fus.

    (2005)
  • M. Skurichina et al.

    Bagging for linear classifiers

    Pattern Recognit.

    (1998)
  • A. Rahman et al.

    Effect of ensemble classifier composition on offline cursive character recognition

    Inf. Process. Manage.

    (2013)
  • L. Rokach

    Decision forest: twenty years of research

    Inf. Fus.

    (2016)
  • G. Giacinto et al.

    Design of effective neural network ensembles for image classification purposes

    Image Vis. Comput.

    (2001)
  • R.M.O. Cruz et al.

    Feature representation selection based on classifier projection space and oracle analysis

    Expert Syst. Appl.

    (2013)
  • E.M. dos Santos et al.

    Overfitting cautious selection of classifier ensembles with genetic algorithms

    Inf. Fus.

    (2009)
  • B. Gabrys et al.

    Genetic algorithms in classifier fusion

    Appl. Soft Comput.

    (2006)
  • Z.-H. Zhou et al.

    Ensembling neural networks: many could be better than all

    Artif. Intell.

    (2002)
  • R.E. Banfield et al.

    Ensemble diversity measures and their application to thinning

    Inf. Fus.

    (2005)
  • L.I. Kuncheva et al.

    Decision templates for multiple classifier fusion: an experimental comparison

    Pattern Recognit.

    (2001)
  • G.L. Rogova

    Combining the results of several neural network classifiers

    Neural Netw.

    (1994)
  • D.M.J. Tax et al.

    Combining multiple classifiers by averaging or by multiplying?

    Pattern Recognit.

    (2000)
  • L. Lam et al.

    Optimal combinations of pattern classifiers

    Pattern Recognit. Lett.

    (1995)
  • D.H. Wolpert

    Stacked generalization

    Neural Netw.

    (1992)
  • Š. Raudys

    Trainable fusion rules. II. Small sample-size effects

    Neural Netw.

    (2006)
  • Š. Raudys

    Trainable fusion rules. I. Large sample size case

    Neural Netw.

    (2006)
  • D. Štefka et al.

    Dynamic classifier aggregation using interaction-sensitive fuzzy measures

    Fuzzy Sets Syst.

    (2015)
  • R.M.O. Cruz et al.

    META-DES.Oracle: meta-learning and feature selection for dynamic ensemble selection

    Inf. Fus.

    (2017)
  • L.I. Kuncheva

    A theoretical study on six classifier fusion strategies

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2002)
  • T.G. Dietterich

    Ensemble methods in machine learning

    International Workshop on Multiple Classifier Systems

    (2000)
  • L.I. Kuncheva

    Combining Pattern Classifiers: Methods and Algorithms

    (2004)
  • M. Fernández-Delgado et al.

    Do we need hundreds of classifiers to solve real world classification problems?

    J. Mach. Learn. Res.

    (2014)
  • D. Opitz et al.

    Popular ensemble methods: an empirical study

    J. Artif. Intell. Res.

    (1999)
  • R. Polikar

    Ensemble based systems in decision making

    IEEE Circuits Syst. Mag.

    (2006)
  • S. Bashbaghi et al.

    Dynamic selection of exemplar-SVMs for watch-list screening through domain adaptation

    Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods (ICPRAM)

    (2017)
  • P.R.L. de Almeida et al.

    Music genre classification using dynamic selection of ensemble of classifiers

    2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

    (2012)
  • M. Galar et al.

    A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches

    IEEE Trans. Syst. Man. Cybern. Part C

    (2012)
  • M. Jahrer et al.

    Combining predictions for accurate recommender systems

    Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

    (2010)
  • D. Di Nucci et al.

    Dynamic selection of classifiers in bug prediction: an adaptive method

    IEEE Trans. Emerg. Topics Comput. Intell.

    (2017)
  • A. Panichella et al.

    Cross-project defect prediction models: L’union fait la force

    2014 Software Evolution Week - IEEE Conference on Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE)

    (2014)