Pattern Recognition

Volume 43, Issue 10, October 2010, Pages 3468-3482

An online core vector machine with adaptive MEB adjustment

https://doi.org/10.1016/j.patcog.2010.05.020

Abstract

The support vector machine (SVM) is a widely used classification technique. However, it is difficult for SVMs to handle very large data sets efficiently. Although decomposed SVMs (DSVMs) and core vector machines (CVMs) have been proposed to overcome this difficulty, they cannot be applied to online classification (classification with learning ability), because when newly arriving samples are misclassified, the classifier has to be adjusted using both the misclassified samples and all the previous training samples. This paper addresses this issue by proposing an online CVM classifier with adaptive minimum-enclosing-ball (MEB) adjustment, called the online CVM (OCVM). The OCVM algorithm has two features: (1) many training samples are permanently deleted during the training process without influencing the final trained classifier; (2) with the limited number of samples retained in the training step, the classifier can be adjusted online using newly arriving misclassified samples. Experiments on both synthetic and real-world data show the validity and effectiveness of the OCVM algorithm.

Introduction

SVMs have been widely used in many real-world applications due to their good performance, including face recognition [1], [2], gene expression data clustering [3], pedestrian detection [4], handwriting recognition [5], and the classification of text [6], [7], fingerprints [8] and textures [9]. The main advantages of SVMs can be summarized as follows [10]:

  • structural risk minimization strikes a compromise between minimizing empirical risk and preventing overfitting;

  • the process of computing the classification hyperplane involves a convex quadratic optimization problem which can be solved efficiently and has a global solution;

  • the obtained classifier is completely determined by the support vectors and the type of kernel function used for training.

Despite the above advantages, SVMs also have the following two main disadvantages, which limit their application in real-time pattern recognition problems:

  • the convex quadratic optimization problem arising in SVMs becomes very large for very large data sets, so it is difficult for SVMs to handle such data efficiently;

  • SVMs handle training samples in batch mode; when a new training sample arrives, the whole training process has to be repeated to adjust the classifier, so SVMs are impractical for online learning.

Recently, many algorithms have been proposed to address the fast computation issue of large-scale SVMs (see [18], [19] for a good literature survey on this). Among these algorithms are the decomposed SVMs (DSVMs) [11], [12], [13], [14], [15] and core vector machines (CVMs) [16], [17]. Their main ideas can be briefly summarized as follows.

  • DSVMs essentially repeat two operations until some optimality condition is satisfied: selecting a working set, and minimizing the original objective function of the quadratic programming (QP) problem arising in SVMs by updating only the variables associated with the working set. The key step in DSVMs is how to select a suitable working set at each iteration.

  • CVMs reformulate SVMs as the minimum enclosing ball (MEB) problems in computational geometry. Then an approximate optimal solution to the original optimization problem can be obtained by utilizing efficient approximate MEB approaches. Reported experimental results on very large data sets have shown that classification results obtained by CVMs are as accurate as those obtained by SVMs, while the computation speed of the former is much faster than that of the latter since the computational complexity of the MEB problem is independent of the dimension and number of data samples.
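The approximate MEB routine that CVMs build on is the Bădoiu–Clarkson core-set iteration: repeatedly find the point furthest from the current center and shift the center toward it. The sketch below illustrates the idea in plain Euclidean space; CVMs run the same loop in kernel feature space using only kernel evaluations. Function name and fixed iteration count are illustrative, not the paper's exact procedure.

```python
import numpy as np

def approx_meb(points, eps=0.01):
    """(1+eps)-approximate minimum enclosing ball via the
    Badoiu-Clarkson core-set iteration (Euclidean sketch)."""
    c = points[0].astype(float)               # initial center: any point
    n_iter = int(np.ceil(1.0 / eps**2))       # O(1/eps^2) iterations suffice
    for t in range(1, n_iter + 1):
        d = np.linalg.norm(points - c, axis=1)
        far = int(np.argmax(d))               # furthest point = new core vector
        c = c + (points[far] - c) / (t + 1)   # shift center toward it
    r = np.linalg.norm(points - c, axis=1).max()
    return c, r
```

Because the iteration count depends only on eps, the cost per iteration, not the approximation guarantee, depends on the data size, which is the source of CVM's scalability.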

DSVMs and CVMs have been successfully applied to many large-scale classification problems. However, the online learning issue of the DSVM and CVM classifiers remains unaddressed. Both algorithms process data in batch mode: when a new training sample arrives, the whole training process has to be repeated to adjust the classifier, so online adjustment of the classifier is impossible.

The online learning ability of a classifier is very important in real-time pattern recognition systems such as pedestrian detection and aircraft visual navigation. In such systems, data arrive in a continuous sequence, and the classifier needs to be adjusted online with misclassified samples to achieve more accurate classification results. Recently, several successful approaches have been proposed to address the online learning issue of SVMs [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], which will be reviewed in the next section. However, very little work has been concerned with deleting training samples effectively without influencing the final trained classifier. Efficient sample deletion is very important for online classification: for very large data sets, a very large number of training samples is needed to obtain a good classifier, so if no samples are deleted, online adjustment of the classifier becomes very difficult, since all training samples are needed to re-train it.

In this paper, we propose an online CVM classification algorithm with adaptive MEB adjustment, based on an efficient redundant samples deletion technique. An advantage of our approach over existing ones is that our online CVM algorithm, called OCVM, can handle very large data sets efficiently, as shown by experimental results on both synthetic and real-world data sets. OCVM consists of the following two steps:

  • Off-line sample deletion: an upper bound is derived on the distance between the center of the approximate MEB at each iteration and the exact MEB of all training samples; this bound is used to identify training samples that are guaranteed to lie inside the final computed MEB, and such samples are permanently deleted from the training set to accelerate the MEB computation.

  • Online classifier adjustment: the samples retained after the off-line deletion step, together with newly arriving misclassified samples, are used to compute the new classifier coefficients; online updating of the classifier is then feasible because only a very limited number of training samples is maintained, thanks to the efficient sample deletion.
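The control flow of the two steps above can be sketched as follows. Note this is only a schematic of the retain-then-update logic: a nearest-prototype rule is used as a hypothetical stand-in for the CVM classifier, and the per-class mean stands in for the paper's bound-based deletion.

```python
import numpy as np

class OnlineSketch:
    """Schematic of the two OCVM steps: retain a small sample set
    off-line, then adjust online using misclassified arrivals."""

    def __init__(self, X, y):
        # "off-line deletion" stand-in: keep one prototype per class
        classes = np.unique(y)
        self.X = np.array([X[y == c].mean(axis=0) for c in classes])
        self.y = list(classes)

    def predict(self, x):
        # label of the nearest retained sample
        return self.y[int(np.argmin(np.linalg.norm(self.X - x, axis=1)))]

    def update(self, x, label):
        # online step: adjust only when the new sample is misclassified
        if self.predict(x) != label:
            self.X = np.vstack([self.X, x])
            self.y.append(label)
```

The key point mirrored here is that `update` touches only the small retained set, never the full training data, which is what makes the online adjustment cheap.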

Experimental results on both synthetic and real-world data sets have been presented to illustrate the validity and effectiveness of the proposed method.

The rest of the paper is organized as follows. A literature review on the existing online learning algorithms for SVMs is presented in Section 2. In Section 3, the classical CVM algorithm is briefly reviewed. The redundant samples deletion algorithm is described in Section 4, and the new OCVM algorithm is presented in Section 5. In Section 6, experiments are conducted on both synthetic and real-world data to illustrate the validity and effectiveness of the proposed method. Some concluding remarks are given in Section 7.


Literature review on existing online SVM algorithms

In this section, we briefly review recent progress on the online learning issue of SVMs. Cheng and Shih [20] proposed an incremental SVM training algorithm based on active query. A subset of training samples is first selected by K-means clustering to compute the initial separating hyperplane. At each iteration, the classifier is updated based on the training samples selected by active query; the iteration stops when there are no unused informative training samples. Syed et al. [21]

Brief review of core vector machines

In this section, the CVM algorithm [16] and the generalized CVM (GCVM) algorithm [17] are reviewed. Both algorithms exploit the relationship between the MEB problem and the SVM to solve the QP problem arising in SVMs with an approximate MEB method. We first review the approximate MEB method, then introduce the CVM and GCVM algorithms, and finally describe the kernel-related problems to which CVM and GCVM can be applied.

The redundant samples deletion algorithm

In the CVM algorithm, it is necessary to compute the distance $\|c_t-\varphi(x)\|$ for each $x\in P$ in order to find the furthest point from $c_t$, where $c_t=\sum_{x_i\in S_t}\alpha_i\varphi(x_i)$. This can be done by noting that
$$\|c_t-\varphi(x)\|^2=\Big\|\sum_{x_i\in S_t}\alpha_i\varphi(x_i)-\varphi(x)\Big\|^2=\sum_{x_i,x_j\in S_t}\alpha_i\alpha_j k(x_i,x_j)-2\sum_{x_i\in S_t}\alpha_i k(x_i,x)+k(x,x).$$
In the GCVM algorithm, it is likewise necessary to calculate the distance $\big\|[c_t;0]-[\varphi(x);\delta]\big\|$ for each $x\in P$ in order to find the furthest point from $[c_t;0]$. It can be seen that
$$\big\|[c_t;0]-[\varphi(x);\delta]\big\|^2=\Big\|\sum_{x_i\in S_t}\alpha_i\varphi(x_i)-\varphi(x)\Big\|^2+\delta^2=\sum_{x_i,x_j\in S_t}\alpha_i\alpha_j k(x_i,x_j)-2\sum_{x_i\in S_t}\alpha_i k(x_i,x)+k(x,x)+\delta^2.$$
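The kernel expansion above translates directly into code: the squared distance from the (implicit) center to every candidate point is obtained from kernel evaluations alone. A sketch, assuming a Gaussian RBF kernel purely for illustration:

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    """RBF kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)

def center_distances(S, alpha, P, gamma=1.0):
    """||c_t - phi(x)||^2 for every x in P, using only kernel values:
    sum_ij a_i a_j k(x_i,x_j) - 2 sum_i a_i k(x_i,x) + k(x,x)."""
    K_SS = rbf(S, S, gamma)
    K_SP = rbf(S, P, gamma)
    const = alpha @ K_SS @ alpha      # ||c_t||^2, independent of x
    kxx = np.ones(len(P))             # k(x,x) = 1 for the RBF kernel
    return const - 2 * (alpha @ K_SP) + kxx
```

Since the first term does not depend on $x$, it is computed once per iteration; finding the furthest point is then `np.argmax(center_distances(S, alpha, P))`.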

Online CVM with adaptive MEB adjustment

Based on the redundant samples deletion (RSD) algorithm proposed in Section 4, online updating of the CVM classifier can be achieved. In this section, we first discuss how the MEB can be learned online from the samples preserved after deletion, and then show how the online approximate MEB algorithm can be embedded in the CVM to achieve online adjustment of the classifier.

Experiments

In this section, we carry out experiments on both synthetic and real-world data sets to show the effectiveness of the proposed algorithms.

We first consider a one-class, two-Gaussian data set to illustrate the online learning procedure of the OCVM algorithm, which is shown in Fig. 2.

As the training data, a set of 400 samples is first generated from a mixture of two two-dimensional Gaussian distributions and marked with blue disks. Then the off-line CVM with the RSD algorithm is

Conclusion

In this paper, an online CVM classifier with adaptive MEB adjustment is proposed. During the training process, many redundant training samples are efficiently removed, which leads to significant savings in the training time of the CVM classifier. During the online classification process, the classifier is updated online based on the preserved samples and newly arriving misclassified samples.

From the theoretical point of view, the deleted samples are proved to be enclosed in the accurate MEB so

Acknowledgements

This work was partly supported by NNSF of China Grants nos. 90820007 and 60975002, the Outstanding Youth Fund of the NNSF of China (Grant no. 60725310), the 863 Program of China (Grant no. 2007AA04Z228) and the 973 Program of China (Grant no. 2007CB311002). The authors thank the referees for their invaluable comments and suggestions, which helped improve the paper greatly.

References (40)

  • K.W. Lau et al., Online training of support vector classifier, Pattern Recognition (2003)
  • F. Orabona et al., On-line independent support vector machines, Pattern Recognition (2010)
  • M. Bădoiu et al., Optimal core-sets for balls, Computational Geometry (2008)
  • E. Osuna, R. Freund, F. Girosi, Training support vector machines: an application to face detection, in: Proceedings of...
  • X.B. Cao et al., A low-cost pedestrian detection system with a single optical camera, IEEE Transactions on Intelligent Transportation Systems (2008)
  • T. Joachims, Text categorization with support vector machines: learning with many relevant features, in: Proceedings of...
  • A. Shilton et al., Incremental training of support vector machines, IEEE Transactions on Neural Networks (2005)
  • T. Joachims, Making large-scale SVM learning practical, in: Advances in Kernel Methods: Support Vector Learning, 1999, ...
  • J.C. Platt, Fast training of support vector machines using sequential minimal optimization, in: Advances in Kernel...
  • C.J. Lin, On the convergence of the decomposition method for support vector machines, IEEE Transactions on Neural Networks (2001)