Improvements on least squares twin multi-class classification support vector machine
Introduction
The support vector machine (SVM) approach, originally proposed by Vapnik and his colleagues [1], [2], [3], [4] for binary classification, is a promising machine learning technique when compared with other approaches, such as artificial neural networks [5]. SVM solves a quadratic programming problem (QPP), which guarantees that any optimal solution obtained is the unique (global) solution; it also implements the structural risk minimization principle, which minimizes an upper bound on the generalization error [6]. The basic idea of SVM is to find an optimal separating hyperplane with maximum margin between two parallel support hyperplanes [7].
Different from the standard SVM, which uses a single hyperplane, Mangasarian and Wild [8] proposed the generalized eigenvalue proximal support vector machine (GEPSVM) for binary classification problems, which generates two nonparallel hyperplanes such that each hyperplane is closer to its own class and as far as possible from the other class. This idea leads to solving two generalized eigenvalue problems, which in turn reduces the computational cost compared with SVM, which needs to solve one QPP [9].
Thereafter, a nonparallel hyperplane classifier termed the twin support vector machine (TWSVM) was proposed by Jayadeva et al. [10] for binary classification, in light of GEPSVM. TWSVM generates two nonparallel hyperplanes by solving a pair of QPPs such that each hyperplane is closer to the patterns of one of the two classes and as far as possible from the other. Each QPP is smaller than the one traditionally solved in SVMs, which makes TWSVM work almost four times faster than the standard SVM classifier.
Due to its lower computational complexity, TWSVM has become one of the most popular methods of this kind. Many variants of TWSVM have been proposed, such as the twin bounded support vector machine (TBSVM) [11], ν-TSVM [12], robust TWSVM [13], least squares TWSVM [14], projection TWSVM [15], and twin support vector regression [16].
It is well known that one significant advantage of TBSVM [11] is its implementation of structural risk minimization by adding a regularization term alongside the margin-maximization idea. An effective method called successive over-relaxation (SOR) [17] is applied to TBSVM in order to shorten training time. To avoid having to solve two quadratic programming problems, the least squares twin support vector machine (LSTSVM) was proposed by Kumar and Gopal [14]; it replaces the convex QPPs of TWSVM with two systems of linear equations by using the squared loss function instead of the hinge loss, leading to very fast training [18].
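To make the contrast with TWSVM's pair of QPPs concrete, the following is a minimal NumPy sketch of the linear LSTSVM idea: each of the two nonparallel hyperplanes is obtained from one regularized least-squares linear system. The function and variable names (`A`, `B`, `c1`, `c2`) are illustrative, not the authors' code.

```python
import numpy as np

def lstsvm_planes(A, B, c1=1.0, c2=1.0):
    """Sketch of linear LSTSVM: solve two linear systems for the two
    nonparallel hyperplanes (w1, b1) and (w2, b2).
    A: samples of class +1 (m1 x n); B: samples of class -1 (m2 x n)."""
    E = np.hstack([A, np.ones((A.shape[0], 1))])  # augmented matrix [A e]
    F = np.hstack([B, np.ones((B.shape[0], 1))])  # augmented matrix [B e]
    # Plane 1: min 0.5||E z1||^2 + 0.5*c1*||F z1 + e||^2  ->  one linear system
    z1 = -c1 * np.linalg.solve(E.T @ E + c1 * F.T @ F, F.T @ np.ones(F.shape[0]))
    # Plane 2 (symmetric role of the classes):
    # min 0.5||F z2||^2 + 0.5*c2*||E z2 - e||^2
    z2 = c2 * np.linalg.solve(F.T @ F + c2 * E.T @ E, E.T @ np.ones(E.shape[0]))
    return z1[:-1], z1[-1], z2[:-1], z2[-1]

def predict(x, w1, b1, w2, b2):
    # Assign x to the class whose hyperplane lies nearer (perpendicular distance)
    d1 = abs(x @ w1 + b1) / np.linalg.norm(w1)
    d2 = abs(x @ w2 + b2) / np.linalg.norm(w2)
    return 1 if d1 < d2 else -1
```

On two well-separated clusters, each fitted plane passes near its own class, so the nearest-plane rule recovers the labels; this is the mechanism that the multi-class variants discussed below inherit.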
Several methods incorporate a regularization term [9], [19], [20], [21], [22], which is used to avoid the singularity problem and to reach better generalization ability, similar to TBSVM [11]. The robust and sparse linear programming TWSVM [19], in addition to incorporating a regularization term, uses the Newton-Armijo algorithm to solve a pair of exterior penalty problems and replaces the 2-norm with the 1-norm in order to generate a robust solution; however, this formulation was not extended to multi-class data sets. Shao et al. [22] proposed the least squares projection twin support vector machine (LSPTSVM) by considering equality constraints and introducing an extra regularization term in the primal problem; however, LSPTSVM is restricted to binary classification. Recently, several multi-class TWSVM approaches have been presented [6], [7], [23], [24], [25]. Inspired by LSPTSVM and the multiple recursive projection twin support vector machine (MPTSVM) [26], which handles multiple classes, Yang et al. [9] presented a multi-class extension of LSPTSVM that solves a series of linear equations instead of the complex QPPs of MPTSVM. This algorithm also handles high-dimensional and large data sets and shows great flexibility in modeling diverse sources of data.
Angulo et al. [27] proposed a new classification algorithm for multi-class problems, termed K-SVCR (support vector classification-regression for K classes). This learning algorithm with ternary outputs is based on Vapnik's support vector theory and evaluates all training samples within a structure during the decomposition phase, using a mixed classification and regression support vector machine. Compared with other multi-class classification algorithms, K-SVCR yields better generalization performance, since all samples are used in the construction of each classification hyperplane. An extension of K-SVCR to the multi-class TWSVM setting, termed Twin-KSVC, was proposed by Xu et al. [6]. Comparing the two algorithms on public data sets from [28], Twin-KSVC outperformed K-SVCR in all cases, in both accuracy and computational cost.
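The "ternary output" decomposition that K-SVCR and Twin-KSVC share can be sketched as follows: for K classes, one classifier is trained per pair of classes, and every training sample receives a label in {+1, -1, 0} for every task, so no sample is discarded. This is an illustrative reconstruction of the label structure, not the authors' implementation.

```python
from itertools import combinations
import numpy as np

def ternary_tasks(y):
    """Build the K(K-1)/2 ternary ('1-versus-1-versus-rest') label vectors
    used by K-SVCR-style multi-class schemes: in the task for the class
    pair (i, j), class i maps to +1, class j to -1, and every remaining
    class to 0, so all training samples participate in every task."""
    tasks = {}
    for i, j in combinations(sorted(set(y)), 2):
        t = np.zeros(len(y), dtype=int)
        t[y == i] = 1
        t[y == j] = -1
        tasks[(i, j)] = t
    return tasks
```

At prediction time, each of the K(K-1)/2 ternary classifiers casts a vote for class i, class j, or neither, and the class with most votes is returned; using all samples in every task is what gives these schemes their reported generalization advantage.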
Recently, Nasiri et al. [23], motivated by the studies [6], [10], [14], [27], proposed a least squares version of Twin-KSVC, abbreviated LSTKSVC, and demonstrated its effectiveness compared to K-SVCR, TWSVM and Twin-KSVC.
Motivated by the studies [6], [23], [27], in this paper we propose improvements on the least squares twin multi-class classification support vector machine (Improvements on LSTKSVC). Experimental results on UCI data sets show that the proposed algorithm achieves higher classification accuracy with computational time similar to that of related methods reported in the literature. The highlights of our Improvements on LSTKSVC are the following:
- The Sherman–Morrison–Woodbury (SMW) formula is employed to reduce the complexity of the nonlinear Improvements on LSTKSVC.
- The solution of our Improvements on LSTKSVC requires solving two systems of linear equations, unlike K-SVCR and Twin-KSVC, which require solving two QPPs.
- Our proposed algorithm evaluates all the training points within a structure, similarly to K-SVCR, Twin-KSVC and LSTKSVC.
One main challenge was finding the best parameters with the grid search method; therefore, to reduce the computational complexity of parameter selection, we fix some of the parameters in advance in our algorithm.
The remainder of this paper is organized as follows: Section (2) outlines SVM, TWSVM, K-SVCR and LSTKSVC and introduces the notation used in the rest of the paper. Section (3) introduces the linear and nonlinear Improvements on LSTKSVC. Section (4) reports experimental results on ten benchmark data sets. Section (5) contains concluding remarks.
Section snippets
Support vector machine
SVMs are a learning technique introduced in the framework of structural risk minimization (SRM) and the theory of Vapnik–Chervonenkis bounds [3], and they remain a state-of-the-art machine learning algorithm. Among the several tutorials on SVMs in the literature, we refer to [29], [30], [31].
Given $m$ training pairs $(x_1, y_1), \ldots, (x_m, y_m)$, where $x_i \in \mathbb{R}^n$ is an input vector labeled by $y_i \in \{-1, +1\}$, the linear SVM classifier searches for an optimal separating hyperplane $w^\top x + b = 0$, where $w \in \mathbb{R}^n$ is
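As a concrete illustration of the setup above, here is a small self-contained NumPy sketch that trains a linear soft-margin SVM by subgradient descent on the primal hinge-loss objective. This is a pedagogical stand-in, not the QP solver the SVM literature actually uses; the data, step size, and parameters are illustrative.

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=5000):
    """Sketch: minimize 0.5||w||^2 + C * sum(max(0, 1 - y_i(w^T x_i + b)))
    by subgradient descent (not the dual QP of the standard SVM)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # points inside or on the wrong side of the margin
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Tiny linearly separable toy set
X = np.array([[0., 0.], [1., 1.], [2., 0.], [3., 3.], [4., 2.], [4., 4.]])
y = np.array([-1., -1., -1., 1., 1., 1.])
w, b = train_linear_svm(X, y)
print(np.sign(X @ w + b))
```

On separable data the recovered hyperplane $w^\top x + b = 0$ sits between the two classes, and the sign of $w^\top x + b$ gives the predicted label.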
Improvements on least squares twin multi-class classification support vector machine
In this section, motivated by the study of Nasiri et al. [23], which proposed LSTKSVC, we present our improvements on the least squares twin multi-class classification support vector machine. The structural risk minimization principle is implemented by introducing a regularization term, as was done by Shao et al. [11]. Fig. 2 illustrates the Improvements on LSTKSVC. The present algorithm evaluates the training points within a structure. Let matrix
Numerical experiments
To evaluate the performance of our Improvements on LSTKSVC, in this section we investigate its performance on ten benchmark data sets from the UCI machine learning repository [28], described in Table 1. The samples are normalized before learning so that the features lie in the [0, 1] range. In our implementation, we focus on the comparison among Twin-KSVC, LSTKSVC and Improvements on LSTKSVC.
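The feature scaling step described above is a standard per-column min-max normalization; a minimal sketch (function name illustrative):

```python
import numpy as np

def minmax_normalize(X):
    """Scale each feature (column) to the [0, 1] range, as applied to the
    UCI samples before training. Constant columns are mapped to 0."""
    X = np.asarray(X, dtype=float)
    mins, maxs = X.min(axis=0), X.max(axis=0)
    span = np.where(maxs > mins, maxs - mins, 1.0)  # avoid division by zero
    return (X - mins) / span

X = np.array([[1., 10.], [2., 30.], [3., 20.]])
print(minmax_normalize(X))
```

Scaling all features to a common range prevents features with large magnitudes from dominating the distance and kernel computations on which these classifiers rely.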
All the experiments have been implemented in Matlab 2015b on a personal computer (PC) with Intel
Conclusion
In this paper, we have proposed an algorithm, termed Improvements on LSTKSVC, that implements the structural risk minimization (SRM) principle by introducing a regularization term into the LSTKSVC formulation for multi-class classification problems, improving the generalization ability. Similarly to LSTKSVC and Twin-KSVC, our algorithm evaluates all the training data within a structure, so it generates ternary outputs. Differently from Twin-KSVC, which needs to solve
Acknowledgments
The authors would like to thank the editor and anonymous reviewers whose valuable comments and feedback have helped us to improve the content and presentation of the paper.
Márcio Lima received his bachelor’s degree in mathematics in 2001 and M.Sc. degree in Mathematics in 2011. Now he is a Ph.D. student in College of computer Science at Universidade Federal de Goiás, Goiânia, Brazil. His research interests include data mining, optimization methods, support vector machine and statistical methods. He is a professor at Federal Institute of Education, Science and Technology of Goiás, Goiânia, Brazil.
References (38)
- et al., Robust twin support vector machine for pattern classification, Pattern Recognit. (2013)
- et al., Recursive projection twin support vector machine via within-class variance minimization, Pattern Recognit. (2011)
- TSVR: an efficient twin support vector machine for regression, Neural Netw. (2010)
- et al., Least squares recursive projection twin support vector machine for classification, Pattern Recognit. (2012)
- et al., Least squares twin multi-class classification support vector machine, Pattern Recognit. (2015)
- et al., An improved multiple birth support vector machine for pattern classification, Neurocomputing (2017)
- et al., K-SVCR: a support vector machine for multi-class classification, Neurocomputing (2003)
- K-nearest neighbor-based weighted multi-class twin support vector machine, Neurocomputing (2016)
- et al., Wavelet twin support vector machines based on glowworm swarm optimization, Neurocomputing (2017)
- et al., A training algorithm for optimal margin classifiers, Proceedings of the Fifth Annual Workshop on Computational Learning Theory (1992)
- Support-vector networks, Mach. Learn.
- The Nature of Statistical Learning Theory
- Statistical Learning Theory
- Pattern Recognition and Neural Networks
- A twin multi-class classification support vector machine, Cognit. Comput.
- Multiple birth support vector machine for multi-class classification, Neural Comput. Appl.
- Multisurface proximal support vector machine classification via generalized eigenvalues, IEEE Trans. Pattern Anal. Mach. Intell.
- Least squares recursive projection twin support vector machine for multi-class classification, Int. J. Mach. Learn. Cybernet.
- Twin support vector machines for pattern classification, IEEE Trans. Pattern Anal. Mach. Intell.
Nattane Luíza da Costa received the M.Sc. degree in Computer Science in 2016 from the Universidade Federal de Goiás, Brazil. She is currently a Ph.D. student at the Department of Computer Science, Universidade Federal de Goiás, Brazil. Her field of research includes data mining, data preprocessing, feature selection and big data.
Rommel Barbosa is a doctor of Computer Science. He is associate professor at Universidade Federal de Goiás, Brazil. His field of research includes data mining, machine learning, feature selection and big data.