An online core vector machine with adaptive MEB adjustment
Introduction
Support vector machines (SVMs) have been widely used in many real-world applications due to their good performance. These applications include face recognition [1], [2], gene expression data clustering [3], pedestrian detection [4], handwriting recognition [5], as well as the classification of text [6], [7], fingerprints [8] and textures [9]. The main advantages of SVMs can be summarized as follows [10]:
- a compromise between minimizing the empirical risk and preventing overfitting is achieved by implementing the structural risk minimization principle;
- the computation of the classification hyperplane involves a convex quadratic optimization problem, which can be solved efficiently and has a global solution;
- the obtained classifier is completely determined by the support vectors and the type of kernel function used for training.
Despite the above advantages, SVMs also have two main disadvantages that limit their application in real-time pattern recognition problems:
- the convex quadratic optimization problem arising in SVMs becomes large-scale for very large data sets, so it is difficult for SVMs to handle such data effectively;
- SVMs handle training samples in batch mode; when a new training sample arrives, the whole training process has to be repeated to adjust the classifier, so SVMs are not practical for online learning.
Recently, many algorithms have been proposed to address the fast computation issue of large-scale SVMs (see [18], [19] for a good literature survey on this). Among these algorithms are the decomposed SVMs (DSVMs) [11], [12], [13], [14], [15] and core vector machines (CVMs) [16], [17]. Their main ideas can be briefly summarized as follows.
- DSVMs essentially repeat two operations until some optimality condition is satisfied: one is to select a working set, and the other is to minimize the original objective function of the quadratic programming (QP) problem arising in SVMs by updating only the variables associated with the working set. The key step in DSVMs is the selection of a suitable working set at each iteration.
- CVMs reformulate SVMs as minimum enclosing ball (MEB) problems from computational geometry. An approximate optimal solution to the original optimization problem can then be obtained by utilizing efficient approximate MEB approaches. Reported experimental results on very large data sets have shown that the classification results obtained by CVMs are as accurate as those obtained by SVMs, while CVMs are much faster to compute, since the complexity of the approximate MEB problem is independent of the dimension and the number of data samples.
DSVMs and CVMs have been successfully applied to many large-scale classification problems. However, the online learning issue of the DSVM and CVM classifiers has still not been addressed. In these two algorithms, data are processed in batch mode: when a new training sample arrives, the whole training process has to be carried out again to adjust the classifier, so online adjustment of the classifier is impossible.
The online learning ability of a classifier is very important in real-time pattern recognition systems such as pedestrian detection and aircraft visual navigation systems. In such systems, data arrive in a consecutive sequence, and the classifier needs to be adjusted online with misclassified samples to achieve more accurate classification results. Recently, several successful approaches have been proposed to address the online learning issue of SVMs [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], which will be reviewed in the next section. However, very little work has addressed deleting training samples effectively without influencing the final trained classifier. Efficient sample deletion is very important for online classification: for very large data sets, a very large number of training samples is needed to obtain a good classifier, so if no samples are deleted, online adjustment of the classifier becomes very difficult, since all training samples are needed to re-train it.
In this paper, we propose an online CVM classification algorithm with adaptive MEB adjustment, based on an efficient redundant sample deletion technique. An advantage of our approach over existing ones is that the resulting algorithm, called OCVM, can handle very large data sets efficiently, as shown by experimental results on both synthetic and real-world data sets. OCVM consists of the following two steps:
- Off-line sample deletion: an upper bound is derived on the distance between the center of the approximate MEB at each iteration and the exact MEB of all training samples; this bound is used to identify training samples that are guaranteed to lie inside the final computed MEB, and such samples are permanently deleted from the training set to accelerate the MEB computation.
- Online classifier adjustment: the training samples preserved after the off-line deletion step, together with newly arriving misclassified samples, are used to compute the new classifier coefficients; online updating of the classifier thus becomes feasible, since only a very limited number of training samples is maintained in this process.
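As a rough illustration of the first step, a deletion test of this kind can be justified by the triangle inequality: if a bound on the distance between the current approximate center and the exact MEB center is available, then any sample close enough to the approximate center must already lie inside the exact ball and can never become a support vector. The sketch below is a hypothetical Euclidean rendering of that test; `r_lower` (a lower bound on the exact MEB radius) and `center_bound` (an upper bound on the center displacement) are placeholders for the paper's actual bounds.

```python
import numpy as np

def delete_redundant(points, c_t, r_lower, center_bound):
    """Sketch of the off-line deletion step. A sample x with
    ||x - c_t|| <= r_lower - center_bound satisfies, by the triangle
    inequality, ||x - c*|| <= ||x - c_t|| + ||c_t - c*|| <= r_lower,
    so it is guaranteed to lie inside the exact MEB and is dropped."""
    d = np.linalg.norm(points - c_t, axis=1)
    keep = d > r_lower - center_bound   # only these can still be support vectors
    return points[keep], points[~keep]
```

With `r_lower = 1.0` and `center_bound = 0.2`, every sample within distance 0.8 of the current center is deleted, which is exactly the mechanism that keeps the preserved set small for the online step.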
The rest of the paper is organized as follows. A literature review on the existing online learning algorithms for SVMs is presented in Section 2. In Section 3, the classical CVM algorithm is briefly reviewed. The redundant samples deletion algorithm is described in Section 4, and the new OCVM algorithm is presented in Section 5. In Section 6, experiments are conducted on both synthetic and real-world data to illustrate the validity and effectiveness of the proposed method. Some concluding remarks are given in Section 7.
Literature review on existing online SVM algorithms
In this section, we briefly review recent progress on the online learning issue of SVMs. Cheng and Shih [20] proposed an incremental training algorithm for SVMs using active query. A subset of training samples is first selected by K-Means clustering to compute the initial separating hyperplane. At each iteration, the classifier is updated based on the training samples selected by active query, and the iteration stops when there are no unused informative training samples left. Syed et al. [21]
Brief review of core vector machines
In this section, the CVM algorithm [16] and the generalized CVM (GCVM) algorithm [17] are reviewed. Both algorithms exploit the relationship between the MEB and the SVM to solve the QP problem arising in SVMs with the approximate MEB method. We first review the approximate MEB method and then introduce the CVM and GCVM algorithms. We also describe the kernel-related problems to which CVM and GCVM can be applied.
The redundant samples deletion algorithm
In the CVM algorithm, it is necessary to compute the distance ‖c_t − φ(z_ℓ)‖ for each training sample z_ℓ in order to find the furthest point from the current center c_t, where c_t = Σ_i α_i φ(z_i). This can be done by noting that ‖c_t − φ(z_ℓ)‖² = Σ_{i,j} α_i α_j k(z_i, z_j) − 2 Σ_i α_i k(z_i, z_ℓ) + k(z_ℓ, z_ℓ). In the GCVM algorithm, it is likewise necessary to calculate the distance from each sample to the center of the augmented ball in order to find the furthest point. It can be seen that
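For illustration, the kernel expansion of the squared distance to the center can be evaluated directly from the expansion coefficients. The sketch below assumes a center of the form c_t = Σ_i α_i φ(z_i) and an RBF kernel; the function names and the choice of kernel are illustrative, not fixed by the paper.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def dist2_to_center(alpha, core, z, kernel=rbf):
    """||c_t - phi(z)||^2 expanded with the kernel trick, where
    c_t = sum_i alpha_i phi(core_i). The alpha' K alpha term does not
    depend on z, so real implementations cache it across candidates;
    it is recomputed here for clarity."""
    K = np.array([[kernel(a, b) for b in core] for a in core])
    cc = alpha @ K @ alpha                                 # ||c_t||^2
    cz = sum(a_i * kernel(x_i, z) for a_i, x_i in zip(alpha, core))
    return cc - 2.0 * cz + kernel(z, z)
```

The furthest-point search in CVM then simply maximizes this quantity over the remaining training samples; caching the constant ‖c_t‖² term is what makes each search pass linear in the kernel evaluations.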
Online CVM with adaptive MEB adjustment
Based on the redundant samples deletion (RSD) algorithm proposed in Section 4, online updating of the CVM classifier can be achieved. In this section, we first discuss how the MEB can be learned online from the samples preserved after deletion, and then show how the online approximate MEB algorithm can be embedded in CVM to achieve online adjustment of the classifier.
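A minimal Euclidean sketch of such an online scheme, assuming a Bădoiu–Clarkson-style refit as in the CVM literature: only the preserved samples are stored, and a newly arriving sample triggers a refit only when it falls outside the current (1+ε)-ball. The class and parameter names are illustrative, not from the paper.

```python
import numpy as np

class OnlineMEB:
    """Sketch of online MEB maintenance (Euclidean case for clarity;
    a kernelized version would replace distances with kernel
    expansions). Covered samples leave the model unchanged."""
    def __init__(self, eps=0.1):
        self.eps = eps
        self.core = []        # preserved (non-deleted) samples
        self.c = None         # current center
        self.r = 0.0          # current radius

    def _refit(self):
        # Badoiu-Clarkson iteration over the preserved samples only.
        pts = np.asarray(self.core)
        c = pts[0].astype(float)
        for t in range(1, int(np.ceil(1.0 / self.eps ** 2)) + 1):
            d = np.linalg.norm(pts - c, axis=1)
            c += (pts[np.argmax(d)] - c) / (t + 1)
        self.c = c
        self.r = np.linalg.norm(pts - c, axis=1).max()

    def update(self, x):
        x = np.asarray(x, dtype=float)
        if self.c is not None and \
                np.linalg.norm(x - self.c) <= (1 + self.eps) * self.r:
            return False      # already covered: classifier unchanged
        self.core.append(x)   # otherwise enlarge the core set and refit
        self._refit()
        return True
```

Because only the small preserved set is refit, each online update stays cheap even when the original training set was very large, which is the point of combining RSD with the online step.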
Experiments
In this section, we carry out experiments on both synthetic and real-world data sets to show the effectiveness of the proposed algorithms.
We first consider a one-class two-Gaussian data set to illustrate the online learning procedure of the OCVM algorithm, which is shown in Fig. 2.
As training data, 400 samples are first generated from a mixture of two two-dimensional Gaussian distributions and marked with blue disks. Then the off-line CVM with the RSD algorithm is
Conclusion
In this paper, an online CVM classifier with adaptive MEB adjustment is proposed. During the training process, many redundant training samples are efficiently removed, which leads to significant savings in the training time of the CVM classifier. During the online classification process, the classifier is updated online based on the preserved samples and the newly arriving misclassified samples.
From the theoretical point of view, the deleted samples are proved to be enclosed in the accurate MEB so
Acknowledgements
This work was partly supported by the NNSF of China (Grant nos. 90820007 and 60975002), the Outstanding Youth Fund of the NNSF of China (Grant no. 60725310), the 863 Program of China (Grant no. 2007AA04Z228) and the 973 Program of China (Grant no. 2007CB311002). The authors thank the referees for their invaluable comments and suggestions, which helped improve the paper greatly.
References
- et al., Face detection using discriminating feature analysis and support vector machine, Pattern Recognition (2006)
- et al., Towards improving fuzzy clustering using support vector machine: application to gene expression data, Pattern Recognition (2009)
- et al., Model selection for the LS-SVM. Application to handwriting recognition, Pattern Recognition (2009)
- et al., Support vector machine-based text detection in digital video, Pattern Recognition (2001)
- et al., Fingerprint classification using one-vs-all support vector machines dynamically ordered with naïve Bayes classifiers, Pattern Recognition (2008)
- et al., Texture classification using the support vector machines, Pattern Recognition (2003)
- et al., A simple decomposition algorithm for support vector machines with polynomial-time convergence, Pattern Recognition (2007)
- et al., An improved incremental training algorithm for support vector machines using active query, Pattern Recognition (2007)
- et al., Kernel-based online machine learning and support vector reduction, Neurocomputing (2008)
- et al., An online support vector machine for abnormal events detection, Signal Processing (2006)
- Online training of support vector classifier, Pattern Recognition
- On-line independent support vector machines, Pattern Recognition
- Optimal core-sets for balls, Computational Geometry
- A low-cost pedestrian detection system with a single optical camera, IEEE Transactions on Intelligent Transportation Systems
- Incremental training of support vector machines, IEEE Transactions on Neural Networks
- On the convergence of the decomposition method for support vector machines, IEEE Transactions on Neural Networks