Creating and measuring diversity in multiple classifier systems using support vector data description
Introduction
Ensemble classifiers, also called committees or multiple classifier systems (MCSs), offer a solution to classification problems by optimizing base classifiers separately [1]. The idea of combining multiple classifiers rests on the observation that achieving optimal performance in combination is not necessarily consistent with obtaining the best performance for each base classifier. The rationale is that it is often more convenient to optimize the design of a combination of relatively simple classifiers than to optimize the design of a single complex classifier. The increase in accuracy obtained by using multiple classifiers is at least partially a result of diversity [2]. Therefore, a better understanding of diversity is expected to result in higher MCS accuracy.
In parallel, many diversity measures have been proposed in the literature to guide the creation of MCSs. Attempts to introduce different diversity measures are driven not only by the need for generality, but also by the search for more efficiency in MCS creation. Various combination methods have been proposed for building MCSs, including classifier selection [3], [4], [5], majority voting [6], weighted majority voting [7], decision templates [8], [9], naïve Bayesian fusion [10], Dempster–Shafer combination [11], fuzzy integral [12], behavior-knowledge space [13], Boosting [14], AdaBoost [15], Bagging [16] and several other methods [17], [18].
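Of the fusion rules listed above, majority voting is the simplest. A minimal sketch follows; the `majority_vote` helper is illustrative and not taken from the paper:

```python
import numpy as np

def majority_vote(predictions):
    """Combine base-classifier label predictions by plurality vote.

    predictions: array of shape (n_classifiers, n_samples) of integer
    labels. Ties are broken in favor of the smallest label value.
    """
    predictions = np.asarray(predictions)
    n_samples = predictions.shape[1]
    fused = np.empty(n_samples, dtype=predictions.dtype)
    for j in range(n_samples):
        # np.unique returns sorted labels, so argmax on counts breaks
        # ties toward the smallest label.
        labels, counts = np.unique(predictions[:, j], return_counts=True)
        fused[j] = labels[np.argmax(counts)]
    return fused

# Three base classifiers disagree on some samples; the vote resolves them.
preds = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 1, 0]]
print(majority_vote(preds))  # -> [0 1 1 0]
```

Weighted majority voting differs only in replacing the raw counts with per-classifier weights, typically derived from validation accuracy.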
The concept of diversity plays an important role in MCS generation [19] and can be achieved by manipulating the initial conditions, architecture, training data, topology and training algorithm of the base classifiers [2]. The main reason is that if the classifiers are sufficiently different, it is reasonable to expect an increase in the overall performance when combining them [20]. It is therefore intuitively accepted that the classifiers to be combined should be diverse, as there is clearly no advantage to be gained from an ensemble composed of a set of identical classifiers [21]. Diversity is a property of an MCS with respect to a set of data: all other factors being equal, diversity is greater when the classifiers spread their decision errors more evenly over the input data space.
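How evenly errors are spread can be quantified with a pairwise measure. The sketch below computes the standard disagreement measure from two classifiers' oracle outputs (1 where a classifier is correct, 0 where it errs); the function name is illustrative:

```python
import numpy as np

def disagreement(correct_i, correct_j):
    """Pairwise disagreement: the fraction of samples on which exactly
    one of the two classifiers is correct. 0 means identical behavior;
    larger values mean the errors are spread more diversely."""
    correct_i = np.asarray(correct_i, dtype=bool)
    correct_j = np.asarray(correct_j, dtype=bool)
    return float(np.mean(correct_i ^ correct_j))

# Oracle outputs (1 = correct) for two classifiers over six samples.
a = [1, 1, 0, 1, 0, 1]
b = [1, 0, 1, 1, 0, 0]
print(disagreement(a, b))  # 3 of 6 samples differ -> 0.5
```

An ensemble-level score is usually obtained by averaging this quantity over all classifier pairs.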
The paper is organized as follows. In Section 2, major related work on diversity is reviewed. In Section 3, the new method is introduced, with a description of how diversity is measured and how the MCS is created. In Section 4, the method is analyzed on a number of known benchmark datasets, and Section 5 concludes this work.
Section snippets
Related works
It has been shown that combining the outputs of several classifiers is only useful if they disagree on some inputs [22], [23]. The measure of disagreement is referred to as diversity of the MCS. For regression problems, mean squared error is generally used to measure the accuracy, while variance is used for diversity. Although it is known that among base classifiers, diversity is a necessary condition for improving MCS performance, there is no general agreement about how to quantify the notion
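For the regression case mentioned above, the accuracy/diversity pair can be made concrete: accuracy is the MSE of the averaged prediction, and diversity is the variance of the member predictions around that average (the ambiguity term). A small illustrative sketch, with hypothetical function names:

```python
import numpy as np

def ensemble_accuracy_and_diversity(member_preds, targets):
    """For a regression ensemble, measure accuracy by the MSE of the
    averaged prediction and diversity by the mean variance of the
    member predictions around that average."""
    member_preds = np.asarray(member_preds, dtype=float)
    targets = np.asarray(targets, dtype=float)
    mean_pred = member_preds.mean(axis=0)          # ensemble output
    mse = np.mean((mean_pred - targets) ** 2)       # accuracy term
    diversity = np.mean((member_preds - mean_pred) ** 2)  # ambiguity term
    return float(mse), float(diversity)

# Two regressors that err in opposite directions: the averaged prediction
# is exact (MSE 0), while the member variance records their diversity.
print(ensemble_accuracy_and_diversity([[1.0, 2.0], [3.0, 2.0]],
                                      [2.0, 2.0]))  # -> (0.0, 0.5)
```

The example shows why diversity helps: individually inaccurate members can cancel each other's errors when they disagree in the right places.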
The new kernel based diversity method
The introduced method constructs the MCS with special attention to diversity as well as accuracy. Such methods fall into two general groups. Some are called diversity driven because they monitor diversity during the construction of the MCS so that its value can steer the construction process. Other methods are called accuracy driven because the main parameter guiding the construction process is accuracy, although in some methods both accuracy and diversity are considered
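To illustrate what "diversity driven" means in practice, the following sketch greedily selects ensemble members by their disagreement with the classifiers already chosen. This is a generic illustrative scheme under assumed inputs (an oracle correctness matrix), not the GWDC algorithm proposed in the paper:

```python
import numpy as np

def greedy_diversity_selection(oracle, k):
    """Diversity-driven selection sketch: seed with the most accurate
    classifier, then greedily add the candidate whose mean pairwise
    disagreement with the current ensemble is largest.

    oracle: (n_classifiers, n_samples) 0/1 matrix, 1 where a classifier
    is correct. Returns the indices of the k selected members.
    """
    oracle = np.asarray(oracle, dtype=bool)
    selected = [int(np.argmax(oracle.mean(axis=1)))]  # accuracy seed
    remaining = set(range(len(oracle))) - set(selected)
    while len(selected) < k and remaining:
        def mean_disagreement(c):
            # Average disagreement of candidate c with current members.
            return np.mean([np.mean(oracle[c] ^ oracle[s]) for s in selected])
        best = max(remaining, key=mean_disagreement)
        selected.append(best)
        remaining.remove(best)
    return selected

# Classifiers 0 and 1 behave identically; 2 errs on different samples,
# so the diversity-driven pass prefers it over the redundant copy.
print(greedy_diversity_selection([[1, 1, 1, 0],
                                  [1, 1, 1, 0],
                                  [0, 0, 0, 1]], k=2))  # -> [0, 2]
```

An accuracy-driven variant would instead rank candidates by validation accuracy and use diversity, if at all, only as a tie-breaker.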
Experiments and analysis
Two main requirements were needed for evaluating the proposed method: first, known datasets for comparing its performance against the other methods and, second, a test procedure for carrying out that comparison.
Table 3 shows the specifications of the datasets used for testing the new method. These datasets are selected from the UCI Machine Learning Repository [49]. Since the new diversity-creating method employs SVDDs with an RBF kernel
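As a concrete stand-in for the SVDD with an RBF kernel: with an RBF kernel, scikit-learn's `OneClassSVM` is equivalent to SVDD, so a minimal one-class description can be sketched as follows (the data and parameter values are illustrative, not those used in the paper's experiments):

```python
import numpy as np
from sklearn.svm import OneClassSVM

# One-class training data: a single Gaussian cloud in 2-D.
rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# With kernel="rbf", OneClassSVM solves the same problem as SVDD;
# nu bounds the fraction of training points left outside the description.
svdd = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X)

# Points near the training cloud fall inside the description (+1),
# distant points fall outside (-1).
print(svdd.predict([[0.0, 0.0], [6.0, 6.0]]))  # -> [ 1 -1]
```

In the MCS setting, one such description is typically fit per class (or per base classifier's region of competence), and the kernel-space boundaries are then compared to assess diversity.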
Conclusion
A new method for measuring diversity based on the SVDT, named GWD, is proposed in this work. A new method for constructing an MCS, named GWDC, is also presented, based on this new measure of diversity. In addition, a classifier fusion method using the SVDT is proposed to construct the MCS. The measure of diversity and the fusion method proposed for constructing the MCS are defined based on the properties of both DT and SVDD. Since the SVDD maps the data into the kernel space by a kernel function, different
References (50)
- et al., Using diversity of errors for selecting members of a committee classifier, Pattern Recognition (2006)
- et al., Decision templates for multiple classifier fusion: an experimental comparison, Pattern Recognition (2001)
- et al., A decision-theoretic generalization of on-line learning and an application to boosting (1997)
- et al., Adaptive fusion and co-operative training for classifier ensembles, Pattern Recognition (2006)
- et al., Controlling the diversity in classifier ensembles through a measure of agreement, Pattern Recognition (2005)
- et al., Relationships between combination methods and measures of diversity in combining classifiers, Information Fusion (2002)
- et al., Software diversity: practical statistics for its measurement and exploitation, Information and Software Technology (1997)
- et al., A generalized adaptive ensemble generation and aggregation approach for multiple classifier systems, Pattern Recognition (2009)
- et al., Support vector domain description, Pattern Recognition Letters (1999)
- et al., A boundary method for outlier detection based on support vector domain description, Pattern Recognition (2009)
- Simplifying particle swarm optimization, Applied Soft Computing
- Tuning SVM parameters by using a hybrid CLPSO–BFGS algorithm, Neurocomputing
- An overview of classifier fusion methods, Computing and Information Systems
- Measures of diversity in classifier ensembles, Machine Learning
- Dynamic classifier selection based on multiple classifier behaviour, Pattern Recognition
- Combination of multiple classifiers using local accuracy estimates, IEEE Transactions on Pattern Analysis and Machine Intelligence
- On combining classifiers, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Combining Pattern Classifiers: Methods and Algorithms
- Switching between selection and fusion in combining classifiers: an experiment, IEEE Transactions on Systems, Man and Cybernetics
- Methods of combining multiple classifiers and their application to handwriting recognition, IEEE Transactions on Systems, Man and Cybernetics
- Use of Dempster–Shafer theory to combine classifiers which use different class boundaries, Pattern Analysis and Applications
- Combining multiple neural networks by fuzzy integral and robust classification, IEEE Transactions on Systems, Man and Cybernetics
- A method of combining multiple experts for the recognition of unconstrained handwritten numerals, IEEE Transactions on Pattern Analysis and Machine Intelligence
- Experiments with a new boosting algorithm
- Bagging predictors, Machine Learning