Elsevier

Information Sciences

Volume 177, Issue 18, 15 September 2007, Pages 3782-3798

Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification

https://doi.org/10.1016/j.ins.2007.03.028

Abstract

Support vector machines (SVMs) are essentially binary classifiers. To improve their applicability, several methods have been suggested for extending SVMs for multi-classification, including one-versus-one (1-v-1), one-versus-rest (1-v-r) and DAGSVM. In this paper, we first describe how binary classification with SVMs can be interpreted using rough sets. A rough set approach to SVM classification removes the necessity of exact classification and is especially useful when dealing with noisy data. Next, by utilizing the boundary region in rough sets, we suggest two new approaches, extensions of 1-v-r and 1-v-1, to SVM multi-classification that allow for an error rate. We explicitly demonstrate how our extended 1-v-r may shorten the training time of the conventional 1-v-r approach. In addition, we show that our 1-v-1 approach may have reduced storage requirements compared to the conventional 1-v-1 and DAGSVM techniques. Our techniques also provide better semantic interpretations of the classification process. The theoretical conclusions are supported by experimental findings involving a synthetic dataset.

Introduction

Classification is one of the most important aspects of data mining. Perceptrons [20], [31] were originally used for binary classification of objects whose representations were linearly separable. That is, perceptrons find a hyperplane separating positive and negative objects. Minsky and Papert [20] presented several problems that could not be solved because of this linear separability restriction. Support vector machines (SVMs) were proposed by Vapnik [33], [34], [35], [36], [37], [38] to overcome the linear separability restriction of perceptrons. SVMs use a mapping to transform the input space into another feature space in which the two classes are linearly separable. Moreover, unlike perceptrons, SVMs find an optimal hyperplane maximizing the distance between the two classes. In practical applications, however, classification tends to involve more than two classes.

Several techniques have been proposed to extend SVMs for multi-classification. Vapnik [33], [34], [35], [36], [37], [38] proposed the one-versus-rest (1-v-r) approach. For each class, the 1-v-r method constructs a binary classifier that separates objects belonging to this class from objects that do not. In the one-versus-one (1-v-1) approach, suggested by Knerr et al. [13], one SVM is constructed for each pair of classes. A third approach to extend SVMs for multi-classification, called DAGSVM, applies directed acyclic graphs to reduce the computation needed when classifying objects [24]. Whereas these three works attempt to extend SVMs for multi-classification, rough sets, proposed by Pawlak [27], [28], [29], [30], have a long history in multi-classification. Rough sets are especially useful when the sample data do not define object classes in terms of precise sets. In these situations, rough sets represent classes of objects using three regions: positive, negative and boundary. Researchers have provided variations of rough set theory involving these three regions [14], [16], [21], [38], [39], [40], [41], [42], [43]. Rough set theory is also widely used for classification in practical applications [21], [32].
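The difference between the 1-v-r and 1-v-1 decompositions can be illustrated with a minimal sketch that only enumerates the binary problems each scheme would train. The function names and the class labels are hypothetical and chosen for illustration; no SVM is trained here.

```python
from itertools import combinations


def one_vs_rest_tasks(classes):
    """Return one binary task per class: (class, all remaining classes)."""
    return [(c, [o for o in classes if o != c]) for c in classes]


def one_vs_one_tasks(classes):
    """Return one binary task per unordered pair of classes."""
    return list(combinations(classes, 2))


# Hypothetical class labels, used only for illustration.
classes = ["A", "B", "C", "D"]

for positive, rest in one_vs_rest_tasks(classes):
    print("1-v-r:", positive, "versus", rest)

for first, second in one_vs_one_tasks(classes):
    print("1-v-1:", first, "versus", second)
```

With four classes, the sketch lists four 1-v-r tasks and six 1-v-1 tasks, in line with the N and N×(N-1)/2 counts for the two schemes.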

In this paper, we propose a rough set approach to SVM classification. We first show that binary classification with SVMs can be described using rough sets [17]. Next, by adopting the boundary region in rough sets, we suggest a new approach to SVM multi-classification. The proposed framework possesses several salient features. A boundary region allows for classification error. This property is useful when dealing with noisy data. Use of rough sets may shorten the training time of the 1-v-r approach, which has been said to have a long training time [2]. Similar to 1-v-r, our approach considers the N classes one at a time. For each class, however, we can safely remove the rough set positive region (the lower bound) from future consideration, as all objects in this positive region definitely belong to the class currently under consideration [18]. Extending the 1-v-1 method using rough sets leads to reduced storage requirements compared with the conventional 1-v-1 and DAGSVM methods [19]. Whereas the latter two approaches store N×(N-1)/2 SVMs [1], our approach only needs to store 2N rules. More specifically, for each class, one rule is stored for the rough set positive region and another for the rough set boundary region. Finally, our technique provides a better semantic interpretation of the classification process. Rough sets provide an explanation of the classification process [17], [18], [19]. This is important, since da Rocha and Yager [6] advocate that describing the relationship between black-box approaches like SVMs and logical rule approaches can lead to semantically enhanced network based classifiers.
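The storage comparison can be made concrete with a short worked calculation. The sketch below only counts stored items under the figures quoted in the text (pairwise SVMs for conventional 1-v-1 and DAGSVM, 2N rules for the rough set approach); the function names are hypothetical, and a rule and an SVM are not necessarily the same size in memory.

```python
def pairwise_svm_count(n_classes):
    """Number of SVMs stored by conventional 1-v-1 and DAGSVM: N*(N-1)/2."""
    return n_classes * (n_classes - 1) // 2


def rough_rule_count(n_classes):
    """Number of rules stored by the rough set approach: 2 per class."""
    return 2 * n_classes


# Compare the two counts for a few illustrative values of N.
for n in (3, 5, 6, 10):
    print(n, pairwise_svm_count(n), rough_rule_count(n))
```

In terms of raw counts, 2N is smaller than N×(N-1)/2 only when N is greater than 5; the two counts are equal at N = 5, and for N = 10 the comparison is 45 SVMs against 20 rules.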

This paper is organized as follows. Background knowledge is given in Section 2. In Section 3, we present our rough set approach to SVM binary classification. Our rough set approach to SVM multi-classification is suggested in Section 4. In Section 5, we provide several advantages of our proposed rough set approach to SVM classification. Experimental illustration of the proposed approaches is reported in Section 6. The conclusion is given in Section 7.

Background knowledge

In this section, we review support vector machines and rough sets.

Rough sets for SVM binary classification

Vapnik [36] has recognized the margin in the SVM approach as an important issue in further theoretical development. This section describes a rough set interpretation of SVM binary classification proposed by Lingras and Butz [17].
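One way to picture the three rough set regions in a binary SVM setting is sketched below. It is an illustration under stated assumptions, not the construction of [17]: it assumes a linear decision function f(x) = w·x + b in the feature space, and it assumes that the boundary region is the band where |f(x)| is below a margin value of 1. The weight vector, bias and sample points are made up for the example.

```python
def decision_value(w, b, x):
    """Linear decision function f(x) = w . x + b (assumed form)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def rough_region(value, margin=1.0):
    """Map a decision value to a rough set region.

    Assumption for illustration: values at or beyond +margin fall in the
    positive region, values at or beyond -margin fall in the negative
    region, and everything in between falls in the boundary region.
    """
    if value >= margin:
        return "positive"
    if value <= -margin:
        return "negative"
    return "boundary"


# Hypothetical separating hyperplane and sample objects.
w, b = (1.0, 0.0), 0.0
for x in [(2.0, 0.0), (-3.0, 1.0), (0.5, 4.0)]:
    print(x, rough_region(decision_value(w, b, x)))
```

Objects that land in the boundary region are not forced into either class, which is the sense in which a rough set view can tolerate noisy data.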

It is assumed that the objects have already been mapped using the same mapping Φ to transform the problem to a linearly separable case. For the remainder of this paper, we assume that all the computations take place in this enhanced feature space. We first consider the ideal

Rough sets for SVM multi-classification

Before presenting a rough set approach to SVM multi-classification, we first review other techniques proposed for extending SVMs from binary classification to multi-classification.

The problem of multi-classification, especially for systems like SVMs, does not present an easy solution [25]. It is generally simpler to construct classifier theory and algorithms for two mutually-exclusive classes than it is for N mutually-exclusive classes. Platt et al. [24] claimed that constructing N-class SVMs

Discussion

In this section, we provide several salient features of representing SVM binary classification and SVM multi-classification in terms of rough sets.

Implementation of the proposed approaches

In this section, we will discuss the operational details of implementing rough set based 1-v-1 and 1-v-r SVM multi-classifications. We do so with the help of the Gist software tools [22], [23] for SVM classification that are downloadable from http://microarray.cpmc.columbia.edu/gist/. The SVM portion of Gist is also available via an interactive web server at http://svm.sdsc.edu [22].

As shown in Fig. 13, we created a synthetic feature space with two dimensions and consisting of 150 objects.

Conclusions

In this paper, we suggested a rough set approach to both SVM binary classification and SVM multi-classification. While SVM binary classification transforms the feature space into another space so that the objects are linearly separable, Cristianini [5] states that this process leads to very high dimensions and their associated computational costs. Moreover, it is also easy to over-fit the data in high dimensional spaces. Our rough set approach to SVM binary classification allows for an

References (43)

  • F. Chang, C.-C. Lin, C.-J. Chen. Applying a hybrid method to handwritten character recognition, in: Proceedings of 17th...
  • F. Chang, C.-H. Chou, C.-C. Lin, C.-J. Chen, A prototype classification method and its application to handwritten...
  • P-H Chen, C.-J. Lin, B. Scholkopf, A Tutorial on Support Vector Machines, 2002....
  • N. Cristianini et al., An Introduction to Support Vector Machines (and Other Kernel-based Learning Methods), 2000.
  • N. Cristianini, Support vector and kernel methods for pattern recognition, 2003....
  • A.F. da-Rocha et al., Neural nets and fuzzy logic
  • H. Dietl, S. Weiss, A novel approach to detect frequency-specific cochlear hearing loss, in: IMA Conference on...
  • J.H. Friedman, Another approach to polychotomous classification, Technical report, Stanford Department of Statistics,...
  • H. Gómez Moreno et al., Color images segmentation using the Support Vector Machines
  • R. Hecht-Nielsen, Neurocomputing, 1990.
  • V.C. Hoffmann, Learning Theory and Support Vector Machines, 2003....