Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification
Introduction
Classification is one of the most important aspects of data mining. Perceptrons [20], [31] were originally used for binary classification of objects whose representations were linearly separable. That is, perceptrons find a hyperplane separating positive and negative objects. Minsky and Papert [20] presented several problems that perceptrons could not solve because of this linear separability restriction. Support vector machines (SVMs) were proposed by Vapnik [33], [34], [35], [36], [37], [38] to overcome this restriction. SVMs use a mapping to transform the input space into another feature space in which the two classes are linearly separable. Moreover, unlike perceptrons, SVMs find an optimal hyperplane maximizing the distance between the two classes. In practical applications, however, classification tends to involve more than two classes.
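To make the role of the mapping concrete, the following minimal sketch uses XOR, a standard example of a problem that is not linearly separable in its input space. The feature map phi and the hyperplane parameters w and b are hand-picked assumptions for illustration only; they are not taken from this paper, and the hyperplane is merely a separating one, not the maximum-margin hyperplane an SVM would find.

```python
# Illustrative only: phi, w and b are hand-picked assumptions, not from the paper.

def phi(x1, x2):
    """Map a 2-D input to a 3-D feature space by adding the product term."""
    return (x1, x2, x1 * x2)

def decision(features, w=(1.0, 1.0, -2.0), b=-0.5):
    """Signed value of the hyperplane w . z + b in the feature space."""
    return sum(wi * zi for wi, zi in zip(w, features)) + b

# XOR labelled with +1 / -1; no single line separates these four points in 2-D.
xor_data = [((0, 0), -1), ((0, 1), +1), ((1, 0), +1), ((1, 1), -1)]

# After the mapping, one hyperplane classifies all four points correctly.
predictions = [1 if decision(phi(*x)) > 0 else -1 for x, _ in xor_data]
labels = [y for _, y in xor_data]
print(predictions == labels)
```

An SVM would normally apply such a mapping implicitly through a kernel and then choose, among all separating hyperplanes, the one with the largest margin.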
Several techniques have been proposed to extend SVMs for multi-classification. Vapnik [33], [34], [35], [36], [37], [38] proposed the one-versus-rest (1-v-r) approach. For each class, the 1-v-r method constructs a binary classifier that separates objects belonging to this class from objects that do not. In the one-versus-one (1-v-1) approach, suggested by Knerr et al. [13], one SVM is constructed for each pair of classes. A third approach to extend SVMs for multi-classification, called DAGSVM, applies directed acyclic graphs to reduce the computation needed when classifying objects [24]. Whereas these three approaches attempt to extend SVMs for multi-classification, rough sets, proposed by Pawlak [27], [28], [29], [30], have a long history in multi-classification. Rough sets are especially useful when the sample data does not define object classes in terms of precise sets. In these situations, rough sets represent classes of objects using three regions: positive, negative and boundary. Researchers have provided variations of rough set theory involving these three regions [14], [16], [21], [38], [39], [40], [41], [42], [43]. Rough set theory is also widely used for classification in practical applications [21], [32].
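The difference between the 1-v-r and 1-v-1 decompositions can be sketched by listing the binary subproblems each one creates. The function names and the four class labels below are hypothetical and chosen only for illustration; no SVM is trained here.

```python
from itertools import combinations

def one_vs_rest_tasks(classes):
    """One binary task per class: that class against all remaining classes."""
    return [(c, [k for k in classes if k != c]) for c in classes]

def one_vs_one_tasks(classes):
    """One binary task per unordered pair of classes."""
    return list(combinations(classes, 2))

classes = ["A", "B", "C", "D"]  # hypothetical class labels
ovr = one_vs_rest_tasks(classes)
ovo = one_vs_one_tasks(classes)

print(len(ovr), "one-versus-rest tasks")   # one per class
print(len(ovo), "one-versus-one tasks")    # one per pair of classes
```

For N classes, 1-v-r yields N binary tasks, each trained on all objects, while 1-v-1 yields N(N-1)/2 tasks, each trained only on the objects of two classes.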
In this paper, we propose a rough set approach to SVM classification. We first show that binary classification with SVMs can be described using rough sets [17]. Next, by adopting the boundary region in rough sets, we suggest a new approach to SVM multi-classification. The proposed framework possesses several salient features. First, a boundary region allows for classification error, which is useful when dealing with noisy data. Second, the use of rough sets may shorten the training time of the 1-v-r approach, which has been reported to be long [2]. Similar to 1-v-r, our approach considers the N classes one at a time. For each class, however, we can safely remove the rough set positive region (the lower bound) from future consideration, as all objects in this positive region definitely belong to the class currently under consideration [18]. Third, extending the 1-v-1 method using rough sets leads to reduced storage requirements compared with the conventional 1-v-1 and DAGSVM methods [19]. Whereas the latter two approaches store N(N-1)/2 SVMs [1], our approach only needs to store 2N rules. More specifically, for each class, one rule is stored for the rough set positive region and another for the rough set boundary region. Finally, our technique provides a better semantic interpretation of the classification process, since rough sets provide an explanation of that process [17], [18], [19]. This is important, since da Rocha and Yager [6] advocate that describing the relationship between black-box approaches like SVMs and logical rule-based approaches can lead to semantically enhanced network-based classifiers.
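The storage comparison can be worked through numerically. The sketch below assumes the pairwise count N(N-1)/2 that follows from building one SVM per pair of classes, and the 2N rules stated above; the function names are hypothetical. It compares counts only, since a stored rule and a stored SVM are different kinds of objects.

```python
def pairwise_svm_count(n_classes):
    """Number of SVMs when one is built for each unordered pair of classes."""
    return n_classes * (n_classes - 1) // 2

def rough_rule_count(n_classes):
    """Two rules per class: one for the positive region, one for the boundary."""
    return 2 * n_classes

# Each row: (number of classes, pairwise SVMs, rough set rules)
rows = [(n, pairwise_svm_count(n), rough_rule_count(n)) for n in (3, 5, 10, 20)]
for n, svms, rules in rows:
    print(f"N={n:2d}  pairwise SVMs={svms:3d}  rules={rules:2d}")
```

The number of rules grows linearly in N while the number of pairwise SVMs grows quadratically, so the rule count is the smaller of the two once N exceeds 5.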
This paper is organized as follows. Background knowledge is given in Section 2. In Section 3, we present our rough set approach to SVM binary classification. Our rough set approach to SVM multi-classification is suggested in Section 4. In Section 5, we provide several advantages of our proposed rough set approach to SVM classification. Experimental illustration of the proposed approaches is reported in Section 6. The conclusion is given in Section 7.
Background knowledge
In this section, we review support vector machines and rough sets.
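As a reminder of the rough set notions used throughout, the following minimal sketch computes the positive, boundary and negative regions of a target set in the classical Pawlak setting, where objects are grouped into equivalence classes of indiscernible objects. The universe, the equivalence classes and the target set are made-up toy data, and the function names are hypothetical.

```python
def approximations(equivalence_classes, target):
    """Return the lower and upper approximations of target."""
    target = set(target)
    lower, upper = set(), set()
    for block in equivalence_classes:
        block = set(block)
        if block <= target:      # block lies entirely inside the target
            lower |= block
        if block & target:       # block overlaps the target
            upper |= block
    return lower, upper

def regions(universe, equivalence_classes, target):
    """Split the universe into positive, boundary and negative regions."""
    lower, upper = approximations(equivalence_classes, target)
    return {
        "positive": lower,
        "boundary": upper - lower,
        "negative": set(universe) - upper,
    }

universe = range(1, 9)                       # toy objects 1..8
blocks = [{1, 2}, {3, 4}, {5, 6}, {7, 8}]    # indiscernible groups
target = {1, 2, 3, 5, 6}                     # class to be approximated
result = regions(universe, blocks, target)
print(result)
```

Objects 3 and 4 cannot be told apart, yet only object 3 belongs to the target, so both fall into the boundary region.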
Rough sets for SVM binary classification
Vapnik [36] has recognized the margin in the SVM approach as an important issue in further theoretical development. This section describes a rough set interpretation of SVM binary classification proposed by Lingras and Butz [17].
It is assumed that the objects have already been mapped using the same mapping Φ to transform the problem to a linearly separable case. For the remainder of this paper, we assume that all the computations take place in this enhanced feature space. We first consider the ideal
Rough sets for SVM multi-classification
Before presenting a rough set approach to SVM multi-classification, we first review other techniques proposed for extending SVMs from binary classification to multi-classification.
The problem of multi-classification, especially for systems like SVMs, does not present an easy solution [25]. It is generally simpler to construct classifier theory and algorithms for two mutually-exclusive classes than it is for N mutually-exclusive classes. Platt et al. [24] claimed that constructing N-class SVMs
Discussion
In this section, we provide several salient features of representing SVM binary classification and SVM multi-classification in terms of rough sets.
Implementation of the proposed approaches
In this section, we discuss the operational details of implementing rough set based 1-v-1 and 1-v-r SVM multi-classification. We do so with the help of the Gist software tools [22], [23] for SVM classification, which are downloadable from http://microarray.cpmc.columbia.edu/gist/. The SVM portion of Gist is also available via an interactive web server at http://svm.sdsc.edu [22].
As shown in Fig. 13, we created a synthetic feature space with two dimensions and consisting of 150 objects.
Conclusions
In this paper, we suggested a rough set approach to both SVM binary classification and SVM multi-classification. While SVM binary classification transforms the feature space into another space so that the objects are linearly separable, Cristianini [5] states that this process leads to very high dimensions and their associated computational costs. Moreover, it is easy to over-fit the data in high dimensional spaces. Our rough set approach to SVM binary classification allows for an
References (43)
On the structure of generalized rough sets. Information Sciences (2006)
From rough sets to soft computing: Introduction. Information Sciences (1998)
Some mathematical structures for computational information. Information Sciences (2000)
Rough classification. International Journal of Man–Machine Studies (1984)
An inquiry into anatomy of conflicts. Information Sciences (1998)
Mining diagnostic rules from clinical databases using rough sets and medical diagnostic model. Information Sciences (2004)
et al. Neighborhood operator systems and approximations. Information Sciences (2002)
Relational interpretations of neighborhood operators and rough set approximation operators. Information Sciences (1998)
Constructive and algebraic methods of the theory of rough sets. Information Sciences (1998)
A comparative study of fuzzy sets and rough sets. Information Sciences (1998)