Neurocomputing

Volume 167, 1 November 2015, Pages 459-466

Optimal construction of one-against-one classifier based on meta-learning

https://doi.org/10.1016/j.neucom.2015.04.048

Highlights

  • ODOAO seeks to construct a one-against-one classifier based on meta-learning.

  • ODOAO utilizes binary base classifiers from various classification algorithms.

  • A meta-classifier effectively combines the outputs from all the base classifiers.

  • The effectiveness of ODOAO is demonstrated through experiments.

Abstract

A commonly used strategy for solving a multi-class classification problem is to decompose the original problem into several binary subproblems. The recently proposed method, diversified one-against-one (DOAO), constructs a one-against-one classifier by selecting the best classifier for each class pair from a set of heterogeneous base classifiers. It was found to yield better classification accuracy than other one-against-one classifiers that are based on individual classification algorithms. This paper presents a novel method, called optimally diversified one-against-one (ODOAO), which improves upon DOAO. ODOAO is based on meta-learning and seeks to construct a multiple classifier system in which a meta-classifier effectively combines the outputs of all the heterogeneous base classifiers, trained using various classification algorithms for every class pair. Experimental results show that ODOAO outperforms DOAO and other one-against-one based methods with statistical significance.

Introduction

Classification is a type of supervised learning task that involves predicting output variables consisting of a finite number of categories called classes. In a classification task, a classification algorithm A defines its hypothesis space HA. Training a classifier amounts to finding the hypothesis h ∈ HA that approximates the true function f given a set of instances called a training dataset. Thus, a classifier corresponds to its hypothesis in the hypothesis space. Finding the hypothesis that is closest to the true function f is crucial for obtaining high classification accuracy. The multiple classifier system (MCS), which combines the outputs of a diverse set of classifiers, has received considerable attention and has been studied by a wide range of researchers [1], [2], [3], [4], [5].

In general, an MCS offers better classification accuracy and robustness than any individual classifier. Dietterich [6] gave three fundamental reasons why an MCS performs well. The first is statistical. Given a finite number of training instances, many hypotheses are equally good; averaging these hypotheses may therefore yield a more stable approximation of f. The second is computational. Because the hypothesis space is so large, a heuristic search is conducted to find the best hypothesis, and the search may get stuck at a local optimum. Repeating the search from several random starts provides a better chance of finding the global optimum. The third is representational. The true function f may not be representable by any single hypothesis in the hypothesis space HA, but may be better approximated by aggregating several hypotheses.

The concept of an MCS has also been successfully applied to multi-class classification problems. This is typically accomplished by decomposing the original problem into several binary subproblems; the base classifiers for the subproblems constitute an MCS. Regarding the decomposition strategy, the two commonly used approaches are one-against-one and one-against-rest [4], [7]. Several experimental studies have argued that the one-against-one approach outperforms the one-against-rest approach [8], [9], and that such a decomposition strategy is also effective for classification algorithms that can handle multi-class classification problems directly [8], [10], [11], [12].
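
As an illustration, the one-against-one decomposition can be sketched as follows. The nearest-centroid base learner and the majority-voting scheme here are illustrative assumptions for the sketch, not the specific classifiers studied in the paper:

```python
from itertools import combinations
import numpy as np

class Centroid:
    """Toy binary learner: predict the class whose training mean is nearest."""
    def fit(self, X, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.labels_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.centroids_[None], axis=2)
        return self.labels_[d.argmin(axis=1)]

def oao_predict(X_tr, y_tr, X_te, make_clf=Centroid):
    # One binary classifier per class pair; each casts one vote per test point,
    # and the class with the most votes wins.
    classes = np.unique(y_tr)
    votes = np.zeros((len(X_te), len(classes)), dtype=int)
    for a, b in combinations(range(len(classes)), 2):
        mask = np.isin(y_tr, [classes[a], classes[b]])
        pred = make_clf().fit(X_tr[mask], y_tr[mask]).predict(X_te)
        votes[:, a] += pred == classes[a]
        votes[:, b] += pred == classes[b]
    return classes[votes.argmax(axis=1)]
```

With C classes this trains C(C-1)/2 binary classifiers, each seeing only the training instances of its two classes.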

Recently, we proposed diversified one-against-one (DOAO) [13], which seeks to find the best classification algorithm for each class pair when applying the one-against-one approach to multi-class classification problems. With DOAO, the best classification algorithm for each class pair is the one with the minimum validation error. The experimental results confirmed that DOAO outperforms other one-against-one classifiers that are based on individual classification algorithms or on voting among them. This is because, according to the so-called no-free-lunch theorem [14], no single algorithm always outperforms the others for every classification problem [15], [16], [17]. Employing a variety of classification algorithms takes advantage of their different inductive biases, thereby yielding better classification accuracy. This effectiveness can also be explained as an extension of the hypothesis space: an MCS with different classification algorithms is more likely to obtain a better hypothesis by searching the union of the hypothesis spaces defined by the different algorithms.
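
DOAO's per-pair selection can be sketched as follows. The two toy candidate algorithms (nearest centroid and 1-nearest-neighbor) are illustrative stand-ins for the heterogeneous candidates used in the paper:

```python
from itertools import combinations
import numpy as np

class Centroid:
    """Toy candidate: nearest class mean."""
    def fit(self, X, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.labels_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.centroids_[None], axis=2)
        return self.labels_[d.argmin(axis=1)]

class OneNN:
    """Toy candidate: 1-nearest neighbor."""
    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.X_[None], axis=2)
        return self.y_[d.argmin(axis=1)]

def doao_fit(X_tr, y_tr, X_val, y_val, algos):
    # For each class pair, train every candidate algorithm and keep the one
    # with minimum validation error on that pair's instances.
    chosen = {}
    for i, j in combinations(np.unique(y_tr), 2):
        m_tr, m_val = np.isin(y_tr, [i, j]), np.isin(y_val, [i, j])
        best = min(
            (a().fit(X_tr[m_tr], y_tr[m_tr]) for a in algos),
            key=lambda clf: np.mean(clf.predict(X_val[m_val]) != y_val[m_val]),
        )
        chosen[(int(i), int(j))] = best
    return chosen
```

The selected per-pair classifiers would then be combined by the usual one-against-one voting at prediction time.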

However, DOAO has two major limitations. First, the minimum validation error does not always indicate the minimum test error, especially when comparing heterogeneous classifiers [18]; selecting a classifier based on the validation error therefore does not guarantee an optimal selection. Second, several classifiers that are properly fused can outperform the single best classifier [19], [20]. To address these limitations, we consider employing a meta-classifier to find the optimal combination of base classifiers. When a meta-classifier is employed, a new instance is first classified by the base classifiers, and their outputs are used as inputs to the meta-classifier, which determines the final classification result.

In this paper, we propose optimally diversified one-against-one (ODOAO), which improves DOAO to achieve better classification accuracy. Following the concept of DOAO, ODOAO seeks the optimal combination of the base classifiers built for every class pair and candidate classification algorithm. To do this, a meta-classifier is trained based on stacking [21], where the input variables are the predicted labels of the base classifiers on the validation dataset and the output variable is the target label. ODOAO is further enhanced by training the meta-classifier with a classification algorithm that can effectively handle the high dimensionality of, and non-linear relationships among, the base classifiers' predictions. We investigate the effectiveness of the proposed method through experiments on multi-class benchmark datasets.
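
The stacking scheme behind ODOAO can be sketched as follows. The toy base learners (nearest centroid and 1-nearest-neighbor) and the use of 1-NN as the meta-classifier are illustrative assumptions for this sketch; the paper's actual candidate algorithms and meta-level learner differ:

```python
from itertools import combinations
import numpy as np

class Centroid:
    """Toy base learner: nearest class mean."""
    def fit(self, X, y):
        self.labels_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.labels_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None] - self.centroids_[None], axis=2)
        return self.labels_[d.argmin(axis=1)]

class OneNN:
    """Toy base learner: 1-nearest neighbor (also used as the meta-classifier here)."""
    def fit(self, X, y):
        self.X_, self.y_ = np.asarray(X, float), np.asarray(y)
        return self
    def predict(self, X):
        d = np.linalg.norm(np.asarray(X, float)[:, None] - self.X_[None], axis=2)
        return self.y_[d.argmin(axis=1)]

def stack_features(base_clfs, X):
    # One meta-feature per base classifier: its predicted label.
    return np.column_stack([clf.predict(X) for clf in base_clfs])

def fit_odoao(X_tr, y_tr, X_val, y_val, algos):
    # Base level: every candidate algorithm on every class pair.
    # Meta level: train on the base classifiers' validation-set predictions.
    base = []
    for i, j in combinations(np.unique(y_tr), 2):
        mask = np.isin(y_tr, [i, j])
        base += [make().fit(X_tr[mask], y_tr[mask]) for make in algos]
    meta = OneNN().fit(stack_features(base, X_val), y_val)
    return base, meta

def predict_odoao(base, meta, X):
    return meta.predict(stack_features(base, X))
```

Unlike DOAO, no per-pair classifier is discarded: the meta-classifier sees the outputs of all base classifiers and learns how to combine them.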

The rest of this paper is organized as follows. In Section 2, we briefly review the related work. In Section 3, we describe our proposed method. We report the experimental results in Section 4, and offer conclusions and future work in Section 5.

Section snippets

Multiple classifier systems for multi-class classification

When the number of classes in a classification problem is more than two, the problem is called a multi-class classification problem. To solve a multi-class classification problem, three strategies can be considered. The first is simply to use classification algorithms that can solve the problem directly, such as decision trees (DT), k-nearest-neighbors (kNN), and artificial neural networks (ANN).

The second strategy is to decompose the original problem into several binary subproblems. This

The proposed method

We first introduce our previous work, DOAO [13], and discuss its drawbacks. Then, we present our proposed method, ODOAO, which addresses these drawbacks.

Experiments

We investigated the effectiveness of the proposed method through experiments on benchmark datasets. This section details the experiments and discusses the results.

Conclusions

DOAO is a recently proposed method that constructs a one-against-one classifier by selecting the best classifier from among several heterogeneous candidate classifiers, and it performs well for multi-class classification problems. However, it has two major limitations. The first is that the minimum validation error does not guarantee the minimum test error. The second is that a selective fusion of the best set of classifiers can perform better than the single best classifier. Thus, we determined that DOAO

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIP) (No. 2011-0030814), and the Brain Korea 21 PLUS Project in 2014. This work was also supported by the Engineering Research Institute of SNU.

Seokho Kang received his B.S. degree in 2011, and is currently a Ph.D. candidate in the Department of Industrial Engineering, College of Engineering, Seoul National University, Seoul, Korea. His research interests include kernel-based learning algorithms, multiple classifier systems, dimensionality reduction, and their data mining applications.

References (50)

  • E. Menahem et al., Troika—an improved stacking schema for classification tasks, Inf. Sci. (2009)

  • S. Kang et al., Approximating support vector machine with artificial neural network for fast prediction, Expert Syst. Appl. (2014)

  • S. García et al., Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power, Inf. Sci. (2010)

  • L. Xu et al., Methods of combining multiple classifiers and their applications to handwriting recognition, IEEE Trans. Syst. Man Cybern. (1992)

  • T.K. Ho et al., Decision combination in multiple classifier systems, IEEE Trans. Pattern Anal. Mach. Intell. (1994)

  • J. Kittler et al., On combining classifiers, IEEE Trans. Pattern Anal. Mach. Intell. (1998)

  • L. Rokach, Ensemble-based classifiers, Artif. Intell. Rev. (2010)

  • T.G. Dietterich, Ensemble methods in machine learning, in: Multiple Classifier Systems, in: Lecture Notes in Computer...

  • A.C. Lorena et al., A review on the combination of binary classifiers in multiclass problems, Artif. Intell. Rev. (2008)

  • C.-W. Hsu et al., A comparison of methods for multiclass support vector machines, IEEE Trans. Neural Netw. (2002)

  • J. Fürnkranz, Round robin classification, J. Mach. Learn. Res. (2002)

  • S. Knerr et al., Single-layer learning revisited: a stepwise procedure for building and training a neural network

  • D.H. Wolpert, The supervised learning no-free-lunch theorems, in: Proceedings of the 6th Online World Conference on...

  • S.Y. Sohn, Meta analysis of classification algorithms for pattern recognition, IEEE Trans. Pattern Anal. Mach. Intell. (1999)

  • T.-S. Lim et al., A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms, Mach. Learn. (2000)

Sungzoon Cho is a professor in the Department of Industrial Engineering, College of Engineering, Seoul National University, Seoul, Korea. His research interests are neural networks, pattern recognition, data mining, and their applications in various areas such as response modeling and keystroke-based authentication. He has published over 100 papers in various journals and proceedings. He also holds a US patent and a Korean patent concerned with keystroke-based user authentication.
