Least squares twin support vector hypersphere (LS-TSVH) for pattern recognition

https://doi.org/10.1016/j.eswa.2010.05.045

Abstract

The twin support vector hypersphere (TSVH) is a novel and efficient pattern recognition tool: it determines a pair of hyperspheres by solving two related SVM-type problems, each smaller than the problem arising in a classical SVM. In this paper we formulate a least squares version of this classifier, termed the least squares twin support vector hypersphere (LS-TSVH). This formulation leads to an extremely simple and fast algorithm for generating a binary classifier based on a pair of hyperspheres. Owing to the equality-type constraints in the formulation, the solution follows from solving two sets of nonlinear equations instead of the two dual quadratic programming problems (QPPs) of TSVH. We show that the two sets of nonlinear equations can be solved using the well-known Newton downhill algorithm. The effectiveness of the proposed LS-TSVH is demonstrated by experimental results on several artificial and benchmark datasets.

Introduction

Support vector machine (SVM) is an excellent kernel-based tool for binary data classification (Burges, 1998; Christianini & Shawe-Taylor, 2002; Vapnik, 1995, 1998). This learning strategy, introduced by Vapnik (1995, 1998), is a principled and very powerful machine learning method. Within a few years of its introduction, SVM had already outperformed most other systems in a wide variety of applications, spanning research areas that range from pattern recognition (Osuna, Freund, & Girosi, 1997a), text categorization (Joachims, Ndellec, & Rouveriol, 1998), and biomedicine (Brown, Grundy, & Lin, 1997) to brain–computer interfaces (Ebrahimi, Garcia, & Vesin, 2003) and financial applications (Ince & Trafalis, 2002).

The theory of SVM is based on the structural risk minimization (SRM) principle (Burges, 1998; Vapnik, 1995, 1998). Generally, the hyperplane is obtained by solving a quadratic programming problem (QPP). One of the main challenges in the classical SVM is that it requires a large training time on huge databases, since it has to optimize a computationally expensive cost function. The performance of a trained SVM classifier also depends on the optimal parameter set, which is usually found by cross-validation on a tuning set; the large training time of SVM likewise prevents one from locating the optimal parameter set on a very fine grid of parameters over a large span. To remove these drawbacks, various algorithms and versions of SVM with comparable classification ability have been reported, including the chunking algorithm (Cortes & Vapnik, 1995), the decomposition method (Osuna, Freund, & Girosi, 1997b), the sequential minimal optimization (SMO) approach (Keerthi et al., 2001; Platt, 1999), geometric algorithms (Keerthi et al., 2000; Mavroforakis & Theodoridis, 2007; Tao et al., 2008), and least squares SVM (LS-SVM) (Suykens & Vandewalle, 1999; Suykens et al., 1999).

All the above classifiers discriminate a pattern by determining in which half-space it lies. Recently, Jayadeva, Khemchandani, and Chandra (2007) proposed the twin support vector machine (TSVM) classifier for binary data classification, in the spirit of the generalized eigenvalue proximal support vector machine (GEPSVM) (Mangasarian & Wild, 2006). The formulation of TSVM is very similar to that of the classical SVM, except that it aims to generate two non-parallel planes such that each plane is closer to one class and as far as possible from the other, as sketched below. TSVM has become a popular machine learning method because of its low computational complexity, and several extensions have followed (Ghorai et al., 2009; Kumar & Gopal, 2008, 2009). However, TSVM also requires inverting a matrix of size (l + 1) × (l + 1) twice, in addition to solving the two QPPs.
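For orientation, the first of TSVM's two primal problems in the linear case takes, up to notation, the following form (a sketch of the formulation of Jayadeva et al. (2007), not a restatement from this paper; the rows of A and B collect the patterns of the two classes, e_1 and e_2 are vectors of ones, and q holds the slack variables):

$$\min_{w_1,\,b_1,\,q}\ \frac{1}{2}\|Aw_1+e_1b_1\|^2+c_1e_2^{\top}q\quad\text{s.t.}\ \ -(Bw_1+e_2b_1)+q\ \ge\ e_2,\ \ q\ \ge\ 0,$$

with the second problem obtained by interchanging the roles of the two classes.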

Recently, we proposed a new hypersphere classifier, termed the twin support vector hypersphere (TSVH) (Peng, submitted for publication). TSVH aims at generating two hyperspheres in the feature space such that each hypersphere contains as many samples of one class as possible and is as far as possible from the other class. Similar to TSVM, TSVH solves two smaller-sized QPPs instead of one large QPP as in the classical SVM. However, the formulation of TSVH is quite different from that of TSVM: the matrix inversions appearing in the objective functions of TSVM's dual QPPs are avoided, which keeps the training cost low. Moreover, unlike TSVM, TSVH has a uniform formulation for the linear and nonlinear cases. These two differences mean that TSVH not only runs faster than TSVM but also admits a more concise implementation.
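Working backwards from the least squares modification introduced in Section 3, which replaces the 1-norm slacks and inequality constraints of (16), (17) by squared slacks and equalities, the primal problem of TSVH for the positive-class hypersphere can be sketched as follows (this reconstruction is our assumption based on those descriptions, not a quotation of (16); φ denotes the feature map, c_+ and R_+ the center and radius, and I_+, I_- the index sets of the two classes):

$$\min_{c_+,\,R_+,\,\xi}\ \frac{1}{2}\sum_{i\in I_+}\|\varphi(x_i)-c_+\|^2-\nu_1R_+^2+C_1\sum_{j\in I_-}\xi_j\quad\text{s.t.}\ \ \|\varphi(x_j)-c_+\|^2\ \ge\ R_+^2-\xi_j,\ \ \xi_j\ \ge\ 0,\ \ j\in I_-.$$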

In the spirit of LS-SVM (Suykens & Vandewalle, 1999; Suykens et al., 1999), in this paper we formulate a least squares version of TSVH for classification problems, termed the least squares twin support vector hypersphere (LS-TSVH). We first consider the primal QPPs of TSVH in a least squares sense and solve them with equality constraints instead of the inequalities of TSVH. As a result, the solution of LS-TSVH follows directly from solving two sets of nonlinear equations, as opposed to the two dual QPPs of TSVH. We then show that the two sets of nonlinear equations can be solved using the well-known Newton downhill method. Computational comparisons of LS-TSVH against LS-TSVM (Kumar & Gopal, 2009), LS-SVM (Suykens & Vandewalle, 1999), TSVH (Peng, submitted for publication), TSVM (Jayadeva et al., 2007), and SVM, in terms of classification accuracy and computing time, have been made on several artificial and benchmark datasets; the results indicate that the algorithm handles large datasets accurately and quickly.
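Since the Newton downhill method is central to the proposed solver, a minimal generic sketch of the iteration may be helpful; the names (newton_downhill, F, J) and the toy system are illustrative assumptions, and the code shows the generic damped Newton scheme rather than the paper's specific equations for LS-TSVH:

    import numpy as np

    def newton_downhill(F, J, x0, tol=1e-8, max_iter=100):
        # Solve F(x) = 0 by Newton's method with a downhill (damping) test.
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            Fx = F(x)
            if np.linalg.norm(Fx) < tol:          # converged
                break
            step = np.linalg.solve(J(x), -Fx)     # full Newton step
            lam = 1.0
            # Downhill test: halve the step until the residual norm decreases.
            while np.linalg.norm(F(x + lam * step)) >= np.linalg.norm(Fx) and lam > 1e-10:
                lam *= 0.5
            x = x + lam * step
        return x

    # Toy system: x0^2 + x1 = 3 and x0 + x1^2 = 5, with root (1, 2).
    F = lambda x: np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])
    J = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])
    print(newton_downhill(F, J, [1.0, 1.0]))      # approx. [1. 2.]

The downhill safeguard halves the Newton step until the residual norm decreases, which protects the iteration from diverging when the starting point is far from a root.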

The paper is organized as follows: Section 2 briefly introduces SVM, TSVM, and our TSVH. Section 3 first proposes the least squares twin support vector hypersphere and then presents a fast learning algorithm for LS-TSVH based on the well-known Newton downhill method. Section 4 presents experimental results, and Section 5 concludes the paper.

Section snippets

Support vector machine

As a state-of-the-art machine learning algorithm, SVM is based on the guaranteed risk bounds of statistical learning theory (Vapnik, 1995, 1998), known as the SRM principle. Compared to other methods, SVM has shown excellent performance in pattern recognition tasks. In the simplest binary pattern recognition tasks, SVM uses a linear separating hyperplane to create a classifier with maximal margin. Consider a binary classification problem with data set D = {(x1, y1), …, (xl, yl)}, where xi
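For reference, the standard soft-margin primal problem underlying this construction (a textbook statement, not a snippet from this paper) is

$$\min_{w,\,b,\,\xi}\ \frac{1}{2}\|w\|^2+C\sum_{i=1}^{l}\xi_i\quad\text{s.t.}\ \ y_i(w^{\top}x_i+b)\ \ge\ 1-\xi_i,\ \ \xi_i\ \ge\ 0,\ \ i=1,\dots,l,$$

whose dual is the QPP referred to throughout Section 1.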

Least squares twin support vector hypersphere

In this section we introduce a least squares version of the TSVH classifier, using the same idea as LS-SVM (Suykens & Vandewalle, 1999; Suykens et al., 1999), by formulating the classification problem as:

$$\min_{c_+,\,R_+,\,\xi}\ \frac{1}{2}\sum_{i\in I_+}\|\varphi(x_i)-c_+\|^2-\nu_1R_+^2+C_1\sum_{j\in I_-}\xi_j^2\quad\text{s.t.}\ \ \|\varphi(x_j)-c_+\|^2=R_+^2-\xi_j,\ \ j\in I_-,\tag{24}$$

$$\min_{c_-,\,R_-,\,\xi}\ \frac{1}{2}\sum_{j\in I_-}\|\varphi(x_j)-c_-\|^2-\nu_2R_-^2+C_2\sum_{i\in I_+}\xi_i^2\quad\text{s.t.}\ \ \|\varphi(x_i)-c_-\|^2=R_-^2-\xi_i,\ \ i\in I_+.\tag{25}$$

Here the pair of QPPs (24), (25) use the square of the 2-norm of the slack variables in the objective functions instead of the 1-norm used in (16), (17), which makes the
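To see why the equality constraints turn (24), (25) into nonlinear equations, note that each slack can be eliminated directly; under our reading of (24), substituting ξ_j = R_+^2 − ‖φ(x_j) − c_+‖^2 gives the unconstrained problem

$$\min_{c_+,\,R_+}\ \frac{1}{2}\sum_{i\in I_+}\|\varphi(x_i)-c_+\|^2-\nu_1R_+^2+C_1\sum_{j\in I_-}\bigl(R_+^2-\|\varphi(x_j)-c_+\|^2\bigr)^2,$$

and setting the gradient with respect to c_+ and R_+^2 to zero yields the set of nonlinear equations solved by the Newton downhill iteration of Section 3; (25) is handled symmetrically.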

Experimental results

To test the performance of our LS-TSVH, we report results in terms of accuracy and execution time on several artificial and publicly available benchmark data sets from the UCI Repository (Blake & Merz, 1998), which are commonly used in testing machine learning algorithms. All the classification methods are implemented in MATLAB 6.5 (MATLAB, 1994) on Windows XP, running on a PC with an Intel P4 processor (2.4 GHz) and 1 GB of RAM. We compare the performances of SVM (Vapnik,

Conclusion and further work

The recently proposed twin support vector hypersphere (TSVH) is a novel and efficient classifier. It determines a pair of hyperspheres by solving two related SVM-type problems, each smaller than the problem arising in a classical SVM. The formulation of TSVH is quite different from that of TSVM: the matrix inversions in the objective functions of TSVM's dual QPPs are avoided. In this paper, in the spirit of LS-SVM, we have formulated a novel least squares TSVH (LS-TSVH). LS-TSVH is an extremely

Acknowledgements

This work has been partly supported by the Shanghai Leading Academic Discipline Project (No. S30405), and the Natural Science Foundation of Shanghai Normal University (No. SK200937).

References

  • Ghorai, S., et al. (2009). Nonparallel plane proximal classifier. Signal Processing.
  • Kumar, M. A., et al. (2008). Application of smoothing technique on twin support vector machines. Pattern Recognition Letters.
  • Tao, Q., et al. (2008). A general soft method for learning SVM classifiers with L1-norm penalty. Pattern Recognition.
  • Blake, C. I., & Merz, C. J. (1998). UCI repository for machine learning databases....
  • Brown, M. P. S., et al. (1997). Knowledge-based analysis of microarray gene expression data by using support vector machine. Proceedings of the National Academy of Sciences of the United States of America.
  • Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery.
  • Christianini, V., et al. (2002). An introduction to support vector machines.
  • Cortes, C., et al. (1995). Support vector networks. Machine Learning.
  • Ebrahimi, T., et al. (2003). Joint time-frequency-space classification of EEG in a brain–computer interface application. Journal on Applied Signal Processing.
  • Ince, H., & Trafalis, T. B. (2002). Support vector machine for regression and applications to financial forecasting. In...
  • Jayadeva, et al. (2007). Twin support vector machines for pattern classification. IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • Joachims, T., Ndellec, C., & Rouveriol, C. (1998). Text categorization with support vector machines: Learning with many...
  • Keerthi, S. S., et al. (2000). A fast iterative nearest point algorithm for support vector machine classifier design. IEEE Transactions on Neural Networks.