Weighted linear loss multiple birth support vector machine based on information granulation for multi-class classification
Introduction
Standard support vector machine (SVM) [1], which is based on statistical learning theory [2] and the Vapnik–Chervonenkis (VC) dimension, classifies 2-category points by assigning them to one of two disjoint half spaces. SVM has drawn extensive attention from scholars [3], [4], [5], [6], [7], [8] and has been applied successfully in many fields [9], [10], [11], [12], [13]. Twin support vector machine (TWSVM) [14], an excellent extension of SVM, generates two nonparallel hyperplanes such that each hyperplane is close to one of the two classes and as far as possible from the other. TWSVM assigns a new sample to a class depending on which hyperplane the sample is closer to. An illustrative diagram of the idea of TWSVM in 2-dimensional space is shown in Fig. 1. TWSVM solves two small-scale quadratic programming problems (QPPs), whereas SVM solves one single QPP with a large number of constraints. Because of this strategy, TWSVM is almost four times faster than standard SVM [15]. In the last several years, TWSVM has been studied extensively and greatly generalized [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27]. Recently, Shao et al. [28] proposed a novel extension of TWSVM, called weighted linear loss twin support vector machine (WLTSVM). Different from TWSVM, WLTSVM solves systems of linear equations. The two systems of linear equations in WLTSVM for binary classification can be solved efficiently with the well-known conjugate gradient algorithm, so WLTSVM can deal with large-scale datasets without any external optimizer. Many real-world pattern recognition problems are multi-class classification problems [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], and WLTSVM has also been extended to the multi-class setting. However, multiple WLTSVM uses the "one-versus-rest" strategy, which has high computational complexity: it builds one binary WLTSVM classifier per class, constructed by treating the samples of that class as positive and all remaining samples as negative. As a result, multiple WLTSVM does not retain the advantages of WLTSVM, namely high performance and low computational complexity. Multiple birth support vector machine (MBSVM) [39] is another novel extension of TWSVM with high performance. MBSVM uses the "all-versus-one" strategy, which in turn takes each class as the negative class and all remaining classes as the positive class, generating a series of binary sub-classifiers to solve the multi-class classification problem. However, MBSVM still needs to solve a series of QPPs.
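The TWSVM decision rule described above (assign a new sample to the class whose hyperplane is closer) can be sketched as follows; the hyperplane parameters in the toy example are illustrative placeholders, not values from the paper:

```python
import numpy as np

def twsvm_predict(x, w1, b1, w2, b2):
    """Assign x to class +1 or -1 depending on which of the two
    nonparallel hyperplanes w_k^T x + b_k = 0 it lies closer to.
    Distances are normalized by the norm of each weight vector."""
    d1 = abs(np.dot(w1, x) + b1) / np.linalg.norm(w1)
    d2 = abs(np.dot(w2, x) + b2) / np.linalg.norm(w2)
    return 1 if d1 <= d2 else -1

# toy example: hyperplane 1 is x2 = 0, hyperplane 2 is x2 = 2
w1, b1 = np.array([0.0, 1.0]), 0.0
w2, b2 = np.array([0.0, 1.0]), -2.0
print(twsvm_predict(np.array([0.0, 0.4]), w1, b1, w2, b2))  # closer to plane 1 -> 1
print(twsvm_predict(np.array([0.0, 1.8]), w1, b1, w2, b2))  # closer to plane 2 -> -1
```

This is only the prediction rule; training the two hyperplanes is where TWSVM's two small-scale QPPs (or WLTSVM's linear systems) come in.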
Several multi-class TWSVMs have been proposed. The strategies used to extend binary TWSVMs to the multi-class setting include: one-versus-rest, one-versus-one, one-versus-one-versus-rest, binary tree, rest-versus-one, and directed acyclic graph (DAG). The one-versus-rest strategy is easy to understand and implement, but the complexity of one-versus-rest based methods is high. In general, one-versus-one based and DAG based multi-class TWSVMs achieve better classification accuracies than other methods, but they need to build a large number of sub-classifiers; when the number of classes is large, they become complex systems. The complexity of one-versus-one-versus-rest based methods is higher than that of one-versus-rest based methods. The binary TWSVMs in rest-versus-one based methods take one class as the negative class and all remaining classes as the positive class, so the numbers of constraints are small; since the complexity is directly related to the number of constraints, the advantage of rest-versus-one based methods over the other approaches is their lower complexity. In this paper, we employ the rest-versus-one strategy to reduce the time complexity.
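As a minimal sketch of the rest-versus-one relabeling described above, the helper below builds one binary task per class, treating that class as negative and all remaining classes as positive; the function name and toy labels are hypothetical:

```python
def rest_versus_one_tasks(labels, classes):
    """For each class c, build a binary task that treats c as the
    negative class (-1) and all remaining classes as the positive
    class (+1): the 'rest-versus-one' / 'all-versus-one' scheme."""
    tasks = {}
    for c in classes:
        tasks[c] = [-1 if y == c else +1 for y in labels]
    return tasks

tasks = rest_versus_one_tasks([0, 1, 2, 1], classes=[0, 1, 2])
print(tasks[1])  # class 1 is negative: [1, -1, 1, -1]
```

Each of the k tasks keeps only one class on the small (negative) side, which is why the per-problem number of constraints stays small.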
Granular computing [40], [41], [42], which covers theories, methods, techniques and tools of granulation, is a powerful approach to handling large-scale information. The essence of granular computing is to find a simple, low-cost approximate solution in place of the exact solution by working with inexact, large-scale information, so that intelligent systems or intelligent control achieve tractability, robustness, low cost and a better description of the real world. Combining granular computing with statistical learning theory is becoming a research hotspot, and many effective granular SVM (GSVM) models for binary classification have been developed [41]. Wang et al. [44] proposed a GSVM model based on a mixed measure; Ding et al. [45] proposed a fast fuzzy support vector machine based on information granulation; Cheng et al. [46] proposed a dynamic GSVM. However, combining granular computing with extensions of TWSVM for multi-class classification remains an open research problem.
This paper proposes a new classifier for multi-class classification, called weighted linear loss multiple birth support vector machine based on information granulation (WLMSVM), to enhance the performance of multiple WLTSVM. The proposed algorithm works as follows. First, it splits the whole feature space into a set of information granules according to the training data and labels each information granule as a "pure granule" or a "mixed granule" depending on the labels of the training samples inside it. A pure granule is a subspace containing only samples with the same class label; a mixed granule is a subspace in which two or more classes are present. Then, WLMSVM builds one multi-class sub-classifier in each mixed granule. Different from multiple WLTSVM, our approach uses the "all-versus-one" strategy, which is the key idea of MBSVM. In a given mixed granule, the sub-classifier of WLMSVM generates one hyperplane for each class by solving a TWSVM-style QPP that treats the samples of one class as negative and all other samples in the mixed granule as positive. The last step is to predict the label of an unlabeled sample. Compared with other methods for multi-class classification, the proposed approach has three advantages: 1) by adopting the "all-versus-one" strategy, our approach as a whole has low computational complexity, and especially when the number of classes is large, WLMSVM can work faster than most other methods; 2) WLMSVM keeps the advantage of the multiple WLTSVM classifier: it replaces the hinge loss with a weighted linear loss, so it only needs to solve several systems of linear equations; 3) the granular computing technique frees each sub-classifier to focus on the local information of the data in its granule.
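The first step, marking each granule as pure or mixed from the labels of the training samples it contains, can be sketched as follows; the granule representation (a mapping from granule id to the labels inside) is an assumption made purely for illustration:

```python
def label_granules(granules):
    """Mark each granule 'pure' if all samples in it share one class
    label, otherwise 'mixed' (two or more classes present)."""
    return {gid: ('pure' if len(set(labels)) <= 1 else 'mixed')
            for gid, labels in granules.items()}

# hypothetical granules: granule id -> class labels of the samples inside
granules = {0: [1, 1, 1], 1: [1, 2, 2], 2: [3]}
print(label_granules(granules))  # {0: 'pure', 1: 'mixed', 2: 'pure'}
```

Only the mixed granules then receive an "all-versus-one" sub-classifier; pure granules need none, since any sample falling into one can be assigned that granule's single class directly.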
The rest of this article is organized as follows. In the next section, a brief review of TWSVM, WLTSVM, MBSVM and granular computing is provided. In Section 3, we introduce the proposed weighted linear loss multiple birth support vector machine based on information granulation for multi-class classification in detail. Experimental results are given in Section 4. In the last section, concluding remarks and directions for further research are presented.
Twin support vector machine
Assume a binary classification problem with m samples in the n-dimensional real space Rn. The set of training data points is represented by T = {(xi, yi) | i = 1, 2, …, m}, where xi ∈ Rn is an input sample and yi ∈ {+1, −1} is the corresponding output. Let the m1 × n matrix A denote the samples belonging to class +1 and the m2 × n matrix B denote the samples belonging to class −1; each row of A is a sample of class +1, and each row of B is a sample of class −1.
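Building the matrices A and B from a labeled training set can be sketched as follows (the toy data are illustrative, not from the paper):

```python
import numpy as np

# toy 2-D training set with labels in {+1, -1}
X = np.array([[1.0, 2.0], [2.0, 3.0], [0.0, -1.0], [-1.0, 0.0]])
y = np.array([+1, +1, -1, -1])

A = X[y == +1]   # m1 x n matrix: one class +1 sample per row
B = X[y == -1]   # m2 x n matrix: one class -1 sample per row
print(A.shape, B.shape)  # (2, 2) (2, 2)
```

TWSVM's two QPPs are then stated directly in terms of A and B, one seeking a hyperplane close to the rows of A, the other close to the rows of B.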
For a linear binary classification problem, TWSVM seeks two nonparallel hyperplanes, each close to one of the two classes and as far as possible from the other.
WLMSVM
In this section, we present a weighted linear loss multiple birth support vector machine based on information granulation. The approach can be divided into three steps: The first step is to split the feature space and build suitable information granules. The second step is to train sub-classifiers in mixed granules and combine them into WLMSVM as a whole. The last step is to predict a new sample. The specific flow of WLMSVM is shown in Fig. 2.
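The prediction step of this three-step flow can be sketched as routing a new sample through its granule: a pure granule directly yields its stored class label, while a mixed granule defers to its local multi-class sub-classifier. All names and the toy one-dimensional granulation below are illustrative assumptions, not the paper's implementation:

```python
def wlmsvm_style_predict(x, find_granule, granule_kind, pure_label, sub_classifiers):
    """Locate the granule containing x; return the single class label
    of a pure granule, or the decision of that granule's sub-classifier
    when the granule is mixed."""
    g = find_granule(x)
    if granule_kind[g] == 'pure':
        return pure_label[g]
    return sub_classifiers[g](x)

# toy 1-D setting: granule 0 is x < 0 (pure, class 'a'); granule 1 is mixed
find = lambda x: 0 if x < 0 else 1
kind = {0: 'pure', 1: 'mixed'}
pure = {0: 'a'}
subs = {1: lambda x: 'b' if x < 5 else 'c'}
print(wlmsvm_style_predict(-1, find, kind, pure, subs))  # 'a'
print(wlmsvm_style_predict(7, find, kind, pure, subs))   # 'c'
```

The point of the granulation is visible even in this sketch: each sub-classifier only ever sees, and only has to separate, the classes present in its own granule.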
Experiments and analysis
In order to test the performance of the proposed algorithm, we run a series of tests on three artificial datasets and several popular UCI datasets. All experiments are implemented on a personal computer with a 3.4 GHz Intel Core i5 CPU and 4 GB of memory in the MATLAB 2012a environment. Unless otherwise specified, we use 10-fold cross-validation and report the average accuracy as the measure of classification accuracy. For the nonlinear case, these problems are tested with kernel functions.
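The evaluation protocol (10-fold cross-validation with the accuracies averaged over the held-out folds) can be sketched as follows; the trivial majority-class "model" exists only to exercise the harness and stands in for any classifier:

```python
import numpy as np

def ten_fold_cv_accuracy(X, y, train_fn, predict_fn, k=10, seed=0):
    """Shuffle the data, split it into k folds, train on k-1 folds,
    test on the remaining fold, and return the average accuracy."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        rest = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[rest], y[rest])
        accs.append(np.mean(predict_fn(model, X[test]) == y[test]))
    return float(np.mean(accs))

# majority-class baseline, just to show the harness running end to end
train = lambda X, y: np.bincount(y).argmax()
pred = lambda m, X: np.full(len(X), m)
X = np.zeros((20, 2)); y = np.array([0] * 15 + [1] * 5)
print(ten_fold_cv_accuracy(X, y, train, pred))  # 0.75
```

With equal fold sizes, the average of the per-fold accuracies for this baseline equals the majority-class fraction, 15/20 = 0.75, regardless of the shuffle.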
Conclusions
In order to further improve the performance of multiple WLTSVM and reduce the complexity of the method as a whole, this paper proposes a weighted linear loss multiple birth support vector machine based on information granulation. The proposed approach for multi-class classification is based on a weighted linear loss and the "all-versus-one" strategy. WLMSVM first splits the feature space into several information granules and then trains a sub-classifier with weighted linear loss in each mixed granule.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Nos. 61379101, 61672522), the National Key Basic Research Program of China (No. 2013CB329502), the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).
References (55)
- Unsupervised spatiotemporal fMRI data analysis using support vector machines, NeuroImage (2009)
- A SVM-based cursive character recognizer, Pattern Recognit. (2007)
- Incremental learning for ν-support vector regression, Neural Netw. (2015)
- An efficient weighted Lagrangian twin support vector machine for imbalanced data classification, Pattern Recognit. (2014)
- Local and global regularized twin SVM, Procedia Comput. Sci. (2013)
- Application of smoothing technique on twin support vector machines, Pattern Recognit. Lett. (2008)
- Improvements on twin parametric-margin support vector machine, Neurocomputing (2015)
- Multitask centroid twin support vector machines, Neurocomputing (2015)
- A twin-hypersphere support vector machine classifier and the fast learning algorithm, Inf. Sci. (2013)
- Non-parallel support vector classifiers with different loss functions, Neurocomputing (2014)
- TPMSVM: a novel twin parametric-margin support vector machine for pattern recognition, Pattern Recognit.
- Weighted least squares projection twin support vector machines with local information, Neurocomputing
- Weighted linear loss twin support vector machine for large-scale classification, Knowl.-Based Syst.
- Least squares twin multi-class classification support vector machine, Pattern Recognit.
- Nonparallel hyperplanes support vector machine for multi-class classification, Procedia Comput. Sci.
- Multi-class classification methods of enhanced LS-TWSVM for strip steel surface defects, J. Iron Steel Res. Int.
- The best separating decision tree twin support vector machine for multi-class classification, Procedia Comput. Sci.
- Granular support vector machines with association rules mining for protein homology prediction, Artif. Intell. Med.
- A dynamic over-sampling procedure based on sensitivity for multi-class problems, Pattern Recognit.
- Cost-sensitive boosting for classification of imbalanced data, Pattern Recognit.
- A GA-based feature selection and parameters optimization for support vector machines, Expert Syst. Appl.
- A GA-based model selection for smooth twin parametric-margin support vector machine, Pattern Recognit.
- Support-vector networks, Mach. Learn.
- The Nature of Statistical Learning Theory
- Least squares support vector machine classifiers, Neural Process. Lett.
- Normalization of linear support vector machines, IEEE Trans. Signal Process.
- Incremental support vector learning for ordinal regression, IEEE Trans. Neural Netw. Learn. Syst.
Shifei Ding, born in Qingdao, received his Ph.D. degree from Shandong University of Science and Technology in 2004. He completed postdoctoral research at the Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences in 2006. He is a professor and Ph.D. supervisor at China University of Mining and Technology. His research interests include intelligent information processing, pattern recognition, machine learning, data mining, and granular computing. He has published 5 books and more than 150 research papers in journals and international conferences.
Xiekai Zhang received his B.Sc. degree in computer science from China University of Mining and Technology in 2013 and is currently pursuing the M.Sc. degree in the School of Computer Science and Technology, China University of Mining and Technology. His research interests include machine learning, pattern recognition, support vector machines, and their applications.
Xuanyue An received her B.Sc. degree in computer science from Jiangsu Normal University in 2015 and is currently pursuing the M.Sc. degree in the School of Computer Science and Technology, China University of Mining and Technology. Her research interests include machine learning, pattern recognition, support vector machines, and their applications.
Yu Xue received his B.Sc. degree in computer science from China University of Mining and Technology in 2013 and is currently pursuing the Ph.D. degree in the School of Computer Science and Technology, China University of Mining and Technology. His research interests include machine learning, pattern recognition, support vector machines, and their applications.