Optimal construction of one-against-one classifier based on meta-learning
Introduction
Classification is a type of supervised learning task that involves predicting output variables consisting of a finite number of categories called classes. In a classification task, a classification algorithm defines its hypothesis space. Training a classifier amounts to finding the hypothesis that approximates the true function f, given a set of instances called a training dataset. Thus, a classifier corresponds to a hypothesis in the hypothesis space, and finding the hypothesis closest to the true function f is crucial for obtaining high classification accuracy. A multiple classifier system (MCS), which combines the outputs of a diverse set of classifiers, has received considerable attention and has been studied by a wide range of researchers [1], [2], [3], [4], [5].
In general, an MCS offers better classification accuracy and robustness than any individual classifier. Dietterich [6] gave three fundamental reasons why an MCS performs well. The first is statistical: given a finite number of training instances, many hypotheses are equally good, so averaging these hypotheses may yield a more stable approximation of f. The second is computational: because the hypothesis space is so large, a heuristic search is conducted to find the best hypothesis, but the search may get stuck at a local optimum. Repeating the search from several random starts provides a better chance of finding the global optimum. The third is representational: the true function f may not be representable by any single hypothesis in the hypothesis space, but may be better approximated by aggregating several hypotheses.
The concept of an MCS has also been successfully applied to multi-class classification problems. This is typically accomplished by decomposing the original problem into several binary subproblems; the base classifiers for the subproblems constitute an MCS. Regarding the decomposition strategy, the two commonly used approaches are one-against-one and one-against-rest [4], [7]. Several experimental studies have argued that the one-against-one approach outperforms the one-against-rest approach [8], [9], and that such a decomposition strategy is also effective for classification algorithms that are capable of dealing with multi-class classification problems directly [8], [10], [11], [12].
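The one-against-one decomposition described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: a K-class problem is split into K(K-1)/2 pairwise subproblems, one binary classifier is trained per pair, and a new instance is classified by majority vote. The toy nearest-centroid learner and the 1-D data are hypothetical, chosen only to keep the sketch self-contained.

```python
from collections import Counter
from itertools import combinations

def fit_nearest_centroid(X, y):
    """Toy binary learner for 1-D features: predict the class whose
    centroid (mean) is nearest to the query point."""
    centroids = {c: sum(x for x, t in zip(X, y) if t == c) /
                    sum(1 for t in y if t == c)
                 for c in set(y)}
    return lambda x: min(centroids, key=lambda c: abs(x - centroids[c]))

def train_one_vs_one(X, y, fit_binary):
    """Decompose a K-class problem into K(K-1)/2 pairwise subproblems,
    training one binary classifier per class pair."""
    models = {}
    for a, b in combinations(sorted(set(y)), 2):
        idx = [i for i, t in enumerate(y) if t in (a, b)]
        models[(a, b)] = fit_binary([X[i] for i in idx], [y[i] for i in idx])
    return models

def predict_one_vs_one(models, x):
    """Every pairwise classifier casts a vote; the majority class wins."""
    votes = Counter(m(x) for m in models.values())
    return votes.most_common(1)[0][0]

X = [1.0, 2.0, 10.0, 11.0, 20.0, 21.0]
y = ["a", "a", "b", "b", "c", "c"]
models = train_one_vs_one(X, y, fit_nearest_centroid)
print(len(models))                       # 3 pairwise classifiers for 3 classes
print(predict_one_vs_one(models, 10.5))  # 'b'
```

Any binary learner can be plugged in via `fit_binary`, which is what allows heterogeneous algorithms to be mixed across class pairs in the methods discussed next.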
Recently, we proposed diversified one-against-one (DOAO) [13], which seeks to find the best classification algorithm for each class pair when applying the one-against-one approach to multi-class classification problems. With DOAO, the best classification algorithm for each class pair is selected as the one with the minimum validation error. The experimental results confirmed that DOAO outperforms other one-against-one classifiers that are based on individual classification algorithms or on voting among them. This is because, according to the so-called no-free-lunch theorem [14], no single algorithm always outperforms the others for every classification problem [15], [16], [17]. Employing a variety of classification algorithms takes advantage of the different inductive biases of the algorithms, thereby yielding better classification accuracy. This effectiveness can also be explained as an extension of the hypothesis space: an MCS built from different classification algorithms is more likely to obtain a better hypothesis, because it searches the union of the hypothesis spaces defined by those algorithms.
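The per-pair selection rule of DOAO reduces to a simple minimization over candidate algorithms. The sketch below assumes the validation errors have already been computed; the class names, algorithm names, and error values are hypothetical, used only to make the rule concrete.

```python
def doao_select(pair_val_errors):
    """DOAO-style selection: for each class pair, keep the candidate
    algorithm whose classifier attains the minimum validation error."""
    return {pair: min(errors, key=errors.get)
            for pair, errors in pair_val_errors.items()}

# Hypothetical validation errors per (class pair, candidate algorithm).
errors = {
    ("cat", "dog"):  {"svm": 0.08, "tree": 0.12, "knn": 0.10},
    ("cat", "bird"): {"svm": 0.15, "tree": 0.05, "knn": 0.09},
    ("dog", "bird"): {"svm": 0.07, "tree": 0.07, "knn": 0.04},
}
print(doao_select(errors))
# {('cat', 'dog'): 'svm', ('cat', 'bird'): 'tree', ('dog', 'bird'): 'knn'}
```

The limitation discussed next is visible even in this toy table: for the ("dog", "bird") pair, "svm" and "tree" are nearly as good as "knn" on validation data, yet DOAO discards them entirely.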
However, there are two major limitations to DOAO. First, the minimum validation error does not always indicate the minimum test error, especially when comparing heterogeneous classifiers [18]; therefore, selecting the classifier based on the validation error does not guarantee the optimal choice. Second, several classifiers that are properly fused can outperform the single best classifier [19], [20]. To address these limitations, we consider employing a meta-classifier to find the optimal combination of base classifiers. When a meta-classifier is employed, a new instance is first classified by the base classifiers, and their outputs are used as inputs to the meta-classifier, which determines the final classification result.
In this paper, we propose optimally diversified one-against-one (ODOAO), which improves on DOAO to achieve better classification accuracy. ODOAO seeks the optimal combination of the base classifiers that are built for every class pair and candidate classification algorithm, following the concept of DOAO. To do this, a meta-classifier is trained based on stacking [21], where the input variables are the predicted labels from the base classifiers on the validation dataset, and the output variable is the target label. ODOAO is further enhanced by training the meta-classifier with a classification algorithm that can effectively handle the high dimensionality of, and non-linear relationships among, the base classifiers' predictions. We investigate the effectiveness of the proposed method through experiments on multi-class benchmark datasets.
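The stacking step described above can be sketched schematically: every base classifier (one per class pair and candidate algorithm) labels each validation instance, the resulting vectors of predicted labels become the meta-classifier's inputs, and the true labels remain the targets. The lambda "classifiers" and thresholds below are hypothetical stand-ins for trained pairwise models; this is an illustration of the data flow, not the paper's implementation.

```python
def build_meta_dataset(base_classifiers, X_val, y_val):
    """Stacking: each validation instance is classified by every base
    classifier; the vector of predicted labels becomes the meta-classifier's
    input vector, and the true label its target."""
    meta_X = [[clf(x) for clf in base_classifiers] for x in X_val]
    return meta_X, list(y_val)

# Hypothetical base classifiers: one per (class pair, algorithm) combination.
base = [
    lambda x: "a" if x < 6 else "b",   # (a, b) pair, candidate algorithm 1
    lambda x: "a" if x < 8 else "b",   # (a, b) pair, candidate algorithm 2
    lambda x: "b" if x > 4 else "c",   # (b, c) pair, candidate algorithm 1
]
meta_X, meta_y = build_meta_dataset(base, [3, 7], ["a", "b"])
print(meta_X)  # [['a', 'a', 'c'], ['b', 'a', 'b']]
print(meta_y)  # ['a', 'b']
```

Any classifier can then be fit on `(meta_X, meta_y)`; the paper's point is that this meta-level dataset is high-dimensional (one feature per pair-algorithm combination), so the meta-learner must cope with that.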
The rest of this paper is organized as follows. In Section 2, we briefly review the related work. In Section 3, we describe our proposed method. We report the experimental results in Section 4, and offer conclusions and future work in Section 5.
Section snippets
Multiple classifier systems for multi-class classification
When the number of classes in a classification problem is more than two, the problem is called a multi-class classification problem. To solve a multi-class classification problem, three strategies can be considered. The first is simply to use the classification algorithms that solve the problems directly, such as decision trees (DT), k-nearest-neighbors (kNN), and artificial neural networks (ANN).
The second strategy is to decompose the original problem into several binary subproblems. This
The proposed method
We first introduce our previous work, DOAO [13], and discuss its drawbacks. Then, we present our proposed method, ODOAO, which addresses these drawbacks.
Experiments
We investigated the effectiveness of the proposed method through experiments on benchmark datasets. This section details the experiments and discusses the results.
Conclusions
DOAO is a recently proposed method that constructs a one-against-one classifier by selecting the best classifiers among several heterogeneous candidate classifiers, and it performs well for multi-class classification problems. However, there are two major limitations. The first is that the minimum validation error does not guarantee the minimum test error. The second is that a selective fusion of the best set of classifiers can perform better than the single best classifier. Thus, we determined that DOAO
Acknowledgments
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIP) (No. 2011-0030814), and the Brain Korea 21 PLUS Project in 2014. This work was also supported by the Engineering Research Institute of SNU.
Seokho Kang received his B.S. degree in 2011, and is currently a Ph.D. candidate in the Department of Industrial Engineering, College of Engineering, Seoul National University, Seoul, Korea. His research interests include kernel-based learning algorithms, multiple classifier systems, dimensionality reduction, and their data mining applications.
References (50)
- et al., A survey of multiple classifier systems as hybrid systems, Inf. Fusion (2014)
- et al., An overview of ensemble methods for binary classifiers in multi-class problems: experimental study on one-vs-one and one-vs-all schemes, Pattern Recognit. (2011)
- et al., A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems, Expert Syst. Appl. (2009)
- et al., Constructing a multi-class classifier using one-against-one approach with different binary classifiers, Neurocomputing (2015)
- A comparative assessment of classification methods, Decis. Support Syst. (2003)
- et al., An approach to the automatic design of multiple classifier systems, Pattern Recognit. Lett. (2001)
- Stacked generalization, Neural Netw. (1992)
- et al., Growing a multi-class classifier with a reject option, Pattern Recognit. Lett. (2008)
- et al., Clustering-based ensembles for one-class classification, Inf. Sci. (2014)
- et al., Constructing support vector machine ensemble, Pattern Recognit. (2003)
- Troika: an improved stacking schema for classification tasks, Inf. Sci.
- Approximating support vector machine with artificial neural network for fast prediction, Expert Syst. Appl.
- Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power, Inf. Sci.
- Methods of combining multiple classifiers and their applications to handwriting recognition, IEEE Trans. Syst. Man Cybern.
- Decision combination in multiple classifier systems, IEEE Trans. Pattern Anal. Mach. Intell.
- On combining classifiers, IEEE Trans. Pattern Anal. Mach. Intell.
- Ensemble-based classifiers, Artif. Intell. Rev.
- A review on the combination of binary classifiers in multiclass problems, Artif. Intell. Rev.
- A comparison of methods for multiclass support vector machines, IEEE Trans. Neural Netw.
- Round robin classification, J. Mach. Learn. Res.
- Single-layer learning revisited: a stepwise procedure for building and training a neural network
- Meta analysis of classification algorithms for pattern recognition, IEEE Trans. Pattern Anal. Mach. Intell.
- A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms, Mach. Learn.
Sungzoon Cho is a professor in the Department of Industrial Engineering, College of Engineering, Seoul National University, Seoul, Korea. His research interests are neural networks, pattern recognition, data mining, and their applications in various areas such as response modeling and keystroke-based authentication. He has published over 100 papers in various journals and proceedings. He also holds a US patent and a Korean patent concerning keystroke-based user authentication.