Neurocomputing, Volume 313, 3 November 2018, Pages 196-205

Improvements on least squares twin multi-class classification support vector machine

https://doi.org/10.1016/j.neucom.2018.06.040

Abstract

Recently, the least squares twin multi-class support vector machine (LSTKSVC) was proposed as a least squares version of the twin multi-class classification support vector machine (Twin-KSVC), both based on the twin support vector machine (TWSVM). In this paper, we propose a novel multi-class classifier, termed Improvements on least squares twin multi-class classification support vector machine, motivated by LSTKSVC and Twin-KSVC. Similarly to LSTKSVC, which evaluates all the training data in a "1-versus-1-versus-rest" structure, the proposed algorithm generates ternary outputs $\{-1, 0, +1\}$. Whereas Twin-KSVC needs to solve two quadratic programming problems (QPPs), the solution of the two modified primal problems of our algorithm reduces to two systems of linear equations. Moreover, our algorithm implements the structural risk minimization (SRM) principle by introducing a regularization term, along with minimizing the empirical risk. To test the efficacy and validity of the proposed method, numerical experiments on ten UCI benchmark data sets are performed. The results obtained further corroborate the effectiveness of the proposed algorithm.

Introduction

The support vector machine (SVM) approach, originally proposed by Vapnik and his colleagues [1], [2], [3], [4] for binary classification, is a promising machine learning technique when compared with other machine learning approaches, such as artificial neural networks [5]. SVM solves a quadratic programming problem (QPP), assuring that once an optimal solution is obtained, it is the unique (global) solution; it also implements the structural risk minimization principle, which minimizes the upper bound of the generalization error [6]. The basic idea of SVM is to find an optimal separating hyperplane with a maximum margin between two parallel support hyperplanes [7].

Different from the standard SVM, which uses a single hyperplane, Mangasarian and Wild [8] proposed the generalized eigenvalue proximal support vector machine (GEPSVM) for binary classification problems, which aims to generate two nonparallel hyperplanes such that each hyperplane is closer to its own class and as far as possible from the other class. This idea leads to solving two generalized eigenvalue problems, which in turn reduces the computational cost compared with SVM, which needs to solve one QPP [9].

Thereafter, a nonparallel hyperplane classifier termed twin support vector machine (TWSVM) was proposed by Jayadeva et al. [10] for binary classification, in light of GEPSVM. TWSVM generates two nonparallel hyperplanes by solving a pair of QPPs such that each hyperplane is closer to the patterns of one of the two classes and as far as possible from the other. Each QPP is smaller than the one traditionally found in SVMs, which makes TWSVM work almost four times faster than the standard SVM classifier.
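
For reference, the first of the pair of TWSVM primal problems (the second is symmetric, with the roles of the two classes exchanged) is usually written as follows, where $A$ and $B$ collect the samples of the two classes, $e_1$ and $e_2$ are vectors of ones, $\xi$ is a slack vector, and $c_1 > 0$ is a penalty parameter:

$$\min_{w_1, b_1, \xi} \; \frac{1}{2}\|A w_1 + e_1 b_1\|^2 + c_1 e_2^T \xi \quad \text{s.t.} \quad -(B w_1 + e_2 b_1) + \xi \ge e_2, \; \xi \ge 0.$$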

Due to its lower computational complexity, TWSVM has become one of the most popular methods of this kind. Many variants of TWSVM have been proposed, such as the twin bounded support vector machine (TBSVM) [11], ν-TSVM [12], robust TWSVM [13], least squares TWSVM [14], projection TWSVM [15], and twin support vector regression [16].

It is well known that one significant advantage of TBSVM [11] is the implementation of structural risk minimization by adding a regularization term in keeping with the idea of maximizing the margin. An effective method, called successive overrelaxation (SOR) [17], is applied to TBSVM in order to shorten training time. To avoid having to solve two quadratic programming problems, the least squares twin support vector machine (LSTSVM) was proposed by Kumar and Gopal [14]; it replaces the convex QPPs of TWSVM with convex linear systems by using the squared loss function instead of the hinge loss, leading to a very fast training speed [18].
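
To illustrate why the least squares variant trains so quickly, the following numpy sketch solves the two closed-form LSTSVM systems derived by Kumar and Gopal [14]. This is a minimal sketch, not the authors' implementation (the experiments in this paper were run in Matlab); A and B hold the samples of the two classes and c1, c2 are penalty parameters.

    import numpy as np

    def lstsvm_fit(A, B, c1, c2):
        """Least squares TWSVM: each hyperplane comes from one linear system."""
        e1 = np.ones((A.shape[0], 1))
        e2 = np.ones((B.shape[0], 1))
        E = np.hstack([A, e1])  # augmented samples of class +1
        F = np.hstack([B, e2])  # augmented samples of class -1

        # First hyperplane: close to class +1, pushed away from class -1.
        z1 = -np.linalg.solve(F.T @ F + (1.0 / c1) * (E.T @ E), F.T @ e2)
        # Second hyperplane: symmetric, with the roles of the classes swapped.
        z2 = np.linalg.solve(E.T @ E + (1.0 / c2) * (F.T @ F), E.T @ e1)
        return (z1[:-1], z1[-1]), (z2[:-1], z2[-1])  # (w1, b1), (w2, b2)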

Several methods have incorporated a regularization term [9], [19], [20], [21], [22], which is utilized to avoid the singularity problem and reach a better generalization ability, similarly to TBSVM [11]. In the robust and sparse linear programming TWSVM [19], in addition to incorporating a regularization term, the Newton–Armijo algorithm was used for solving a pair of exterior penalty problems, and the 1-norm replaced the 2-norm in order to generate a robust solution. However, this formulation was not extended to multi-class problems. Shao et al. [22] proposed the least squares projection twin support vector machine (LSPTSVM) by considering equality constraints, with an extra regularization term introduced in the primal problem of LSPTSVM. However, LSPTSVM was limited to binary classification. Recently, several multi-class TWSVM approaches have been presented [6], [7], [23], [24], [25]. Inspired by LSPTSVM and by the multiple recursive projection twin support vector machine (MPTSVM) [26], which handles multiple classes, Yang et al. [9] presented an extension of LSPTSVM to multi-class problems that solves a series of linear equations instead of the complex QPPs of MPTSVM. This algorithm also deals with high-dimensional and large data sets, and shows great flexibility in modeling diverse sources of data.

Angulo et al. [27] proposed a new classification algorithm for multi-class problems, termed K-SVCR (Support Vector Classification-Regression for K classes). This learning algorithm, with ternary outputs $\{-1, 0, +1\}$, is based on Vapnik's support vector theory and evaluates all training samples in a "1-versus-1-versus-rest" structure during the decomposition phase by using a mixed classification and regression support vector machine. Compared with other multi-class classification algorithms, K-SVCR yields better generalization performance, since all samples are used in the construction of each classification hyperplane. An extension of K-SVCR to the multi-class TWSVM setting, termed Twin-KSVC, was proposed by Xu et al. [6]. Comparing the two algorithms on public data sets obtained from [28], it can be seen that in all cases Twin-KSVC outperformed K-SVCR, both in accuracy and in computational cost.
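
To make the "1-versus-1-versus-rest" decomposition concrete, the sketch below enumerates the $K(K-1)/2$ ternary subproblems: for each pair of classes $(i, j)$, samples of class $i$ are labeled $+1$, samples of class $j$ are labeled $-1$, and all remaining samples are labeled $0$. The function name is ours, for illustration only.

    import numpy as np
    from itertools import combinations

    def one_vs_one_vs_rest_tasks(y):
        """Yield ((i, j), ternary_labels) for each 1-vs-1-vs-rest subproblem.

        Every sample takes part in every subproblem, which is what gives
        K-SVCR-style methods their good generalization behavior.
        """
        y = np.asarray(y)
        for i, j in combinations(np.unique(y), 2):
            labels = np.zeros_like(y)
            labels[y == i] = 1
            labels[y == j] = -1
            yield (i, j), labels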

Recently, Nasiri et al. [23], motivated by the studies [6], [10], [14], [27], proposed a least squares version of Twin-KSVC, abbreviated LSTKSVC, demonstrating its effectiveness compared to K-SVCR, TWSVM and Twin-KSVC.

Motivated by the studies [6], [23], [27], in this paper we propose Improvements on least squares twin multi-class classification support vector machine (Improvements on LSTKSVC). The experimental results on UCI data sets show that the proposed Improvements on LSTKSVC algorithm achieves higher classification accuracy with a computational time similar to that of related methods from the literature. The highlights of our Improvements on LSTKSVC are the following:

  • The Sherman–Morrison–Woodbury (SMW) formula is employed to reduce the computational complexity of the nonlinear Improvements on LSTKSVC (the identity is recalled after this list).

  • The solution of our Improvements on LSTKSVC requires solving two systems of linear equations, unlike K-SVCR and Twin-KSVC, which require solving two QPPs.

  • Our proposed algorithm evaluates all the training points in a "1-versus-1-versus-rest" structure, similarly to K-SVCR, Twin-KSVC and LSTKSVC.
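
For reference, the SMW identity invoked in the first highlight expresses the inverse of a perturbed matrix through the inverse of a smaller one,

$$(A + UCV)^{-1} = A^{-1} - A^{-1} U \left(C^{-1} + V A^{-1} U\right)^{-1} V A^{-1},$$

which in the least squares TWSVM family is typically applied in the special case $(\varepsilon I + Z^T Z)^{-1} = \frac{1}{\varepsilon}\left(I - Z^T (\varepsilon I + Z Z^T)^{-1} Z\right)$, so that only the smaller of the two Gram matrices ever needs to be inverted.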

One main challenge was finding the best parameters via the grid search method; therefore, to reduce the computational complexity of parameter selection, we set $c_1 = c_3$, $c_2 = c_4$ and $c_5 = c_6$ in our algorithm.
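
A sketch of the resulting reduced grid search follows: with the ties $c_1 = c_3$, $c_2 = c_4$ and $c_5 = c_6$, only three penalty values have to be tuned (plus a kernel parameter in the nonlinear case). Here train_and_score is a hypothetical user-supplied callable, e.g. the cross-validation accuracy of the classifier under the given parameters.

    from itertools import product

    def reduced_grid_search(train_and_score, grid=None):
        """Grid search with the parameter ties c3 = c1, c4 = c2, c6 = c5."""
        if grid is None:
            # Powers of two are a common choice in the TWSVM literature.
            grid = [2.0 ** k for k in range(-5, 6)]
        best_params, best_score = None, float("-inf")
        for c1, c2, c5 in product(grid, repeat=3):
            # The ties shrink the search space from |grid|^6 to |grid|^3.
            params = dict(c1=c1, c2=c2, c3=c1, c4=c2, c5=c5, c6=c5)
            score = train_and_score(params)
            if score > best_score:
                best_params, best_score = params, score
        return best_params, best_score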

The remainder of this paper is organized as follows: Section (2) outlines SVM, TWSVM, K-SVCR and LSTKSVC and introduces the notation used in the rest of the paper. Section (3) introduces the linear and nonlinear Improvements on LSTKSVC. Section (4) deals with experimental results on ten benchmark data sets. Section (5) contains concluding remarks.

Section snippets

Support vector machine

SVMs represent a learning technique introduced in the framework of structural risk minimization (SRM) and the theory of Vapnik–Chervonenkis bounds [3]. It is a state-of-the-art machine learning algorithm. Among the several tutorials in the SVM literature, we refer to [29], [30], [31].

Given $m$ training pairs $(x_1, y_1), \ldots, (x_m, y_m)$, where $x_i \in \mathbb{R}^n$, $i = 1, 2, \ldots, m$, is an input vector labeled by $y_i \in \{-1, +1\}$, the linear SVM classifier searches for an optimal separating hyperplane $w^T x + b = 0$, where $w \in \mathbb{R}^n$ is
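
The snippet is truncated at this point; for context, the standard soft-margin SVM primal that this setup leads to is

$$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0, \; i = 1, \ldots, m,$$

where $C > 0$ trades off the margin width against the empirical error.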

Improvements on least squares twin multi-class classification support vector machine

In this section, motivated by the study developed by Nasiri et al. [23], which proposes LSTKSVC, we propose our Improvements on least squares twin multi-class classification support vector machine. The structural risk minimization principle is implemented by introducing a regularization term, as done by Shao et al. [11]. Fig. 2 illustrates the Improvements on LSTKSVC. The present algorithm evaluates the training points in a "1-versus-1-versus-rest" structure. Let matrix $A \in \mathbb{R}^{m_1 \times n}$,
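
The snippet cuts off before the formulation. Purely to illustrate the "two systems of linear equations" claim from the abstract, the numpy sketch below solves a regularized least squares system of the kind the LSTKSVC family reduces to, with A, B and C holding the samples labeled $+1$, $-1$ and $0$, respectively. The coefficients here are assumptions for illustration, not the paper's exact primal problems.

    import numpy as np

    def solve_ternary_ls_system(A, B, C, c1, c2, reg, eps=0.1):
        """Illustrative LSTKSVC-style linear system (coefficients assumed).

        The term reg * I implements the SRM principle via regularization
        and also guarantees that the system is always solvable.
        """
        E = np.hstack([A, np.ones((A.shape[0], 1))])  # class +1, augmented
        F = np.hstack([B, np.ones((B.shape[0], 1))])  # class -1, augmented
        G = np.hstack([C, np.ones((C.shape[0], 1))])  # rest class, augmented
        e2 = np.ones((B.shape[0], 1))
        e3 = np.ones((C.shape[0], 1))

        lhs = E.T @ E + c1 * (F.T @ F) + c2 * (G.T @ G) + reg * np.eye(E.shape[1])
        rhs = -(c1 * (F.T @ e2) + c2 * (1 - eps) * (G.T @ e3))
        z1 = np.linalg.solve(lhs, rhs)  # z1 = [w1; b1]
        return z1[:-1], z1[-1]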

Numerical experiments

To evaluate the performance of our Improvements on LSTKSVC, in this section we investigate its performance on ten benchmark data sets from the UCI machine learning repository [28], described in Table 1. The samples are normalized before learning such that the features lie in the [0, 1] range. In our implementation, we focus on the comparison among Twin-KSVC, LSTKSVC and Improvements on LSTKSVC.
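
The [0, 1] scaling mentioned above is plain min-max normalization; a minimal sketch follows. Computing the statistics on the training set only and reusing them on the test set is common practice to avoid information leakage, though the paper does not spell out this detail.

    import numpy as np

    def minmax_fit_transform(X_train, X_test):
        """Scale each feature to [0, 1] using training-set statistics only."""
        lo, hi = X_train.min(axis=0), X_train.max(axis=0)
        span = np.where(hi > lo, hi - lo, 1.0)  # guard constant features
        return (X_train - lo) / span, (X_test - lo) / span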

All the experiments have been implemented in Matlab 2015b on a personal computer (PC) with Intel

Conclusion

In this paper, we have proposed an algorithm, termed Improvements on LSTKSVC, that implements the structural risk minimization (SRM) principle by introducing a regularization term in the LSTKSVC version for multi-class classification problems, improving the generalization ability. Similarly to LSTKSVC and Twin-KSVC, our algorithm evaluates all the training data in a "1-versus-1-versus-rest" structure, so it generates ternary outputs $\{-1, 0, +1\}$. Differently from Twin-KSVC, which needs to solve

Acknowledgments

The authors would like to thank the editor and anonymous reviewers whose valuable comments and feedback have helped us to improve the content and presentation of the paper.


References (38)

  • C. Cortes et al., Support-vector networks, Mach. Learn. (1995)
  • V. Vapnik, The Nature of Statistical Learning Theory (2013)
  • V. Vapnik, Statistical Learning Theory (1998)
  • B.D. Ripley, Pattern Recognition and Neural Networks (2007)
  • Y. Xu et al., A twin multi-class classification support vector machine, Cognit. Comput. (2013)
  • Z.-X. Yang et al., Multiple birth support vector machine for multi-class classification, Neural Comput. Appl. (2013)
  • O.L. Mangasarian et al., Multisurface proximal support vector machine classification via generalized eigenvalues, IEEE Trans. Pattern Anal. Mach. Intell. (2006)
  • Z.-M. Yang et al., Least squares recursive projection twin support vector machine for multi-class classification, Int. J. Mach. Learn. Cybernet. (2016)
  • Jayadeva et al., Twin support vector machines for pattern classification, IEEE Trans. Pattern Anal. Mach. Intell. (2007)

Márcio Lima received his bachelor's degree in Mathematics in 2001 and an M.Sc. degree in Mathematics in 2011. He is now a Ph.D. student in the College of Computer Science at Universidade Federal de Goiás, Goiânia, Brazil. His research interests include data mining, optimization methods, support vector machines and statistical methods. He is a professor at the Federal Institute of Education, Science and Technology of Goiás, Goiânia, Brazil.

Nattane Luíza da Costa received the M.Sc. degree in Computer Science in 2016 from the Universidade Federal de Goiás, Brazil. She is currently a Ph.D. student at the Department of Computer Science, Universidade Federal de Goiás, Brazil. Her field of research includes data mining, data preprocessing, feature selection and big data.

Rommel Barbosa is a doctor of Computer Science. He is an associate professor at Universidade Federal de Goiás, Brazil. His field of research includes data mining, machine learning, feature selection and big data.
