Abstract
In this paper we propose some improvements to a recent decomposition technique for the large quadratic program arising in training support vector machines. Like standard decomposition approaches, the technique we consider is based on the idea of optimizing, at each iteration, a subset of the variables through the solution of a quadratic programming subproblem. The innovative features of this approach are the use of a very effective gradient projection method for the inner subproblems and a special rule for selecting the variables to be optimized at each step. These features make it possible to obtain promising performance by decomposing the problem into a few large subproblems instead of the many small subproblems usually used by other decomposition schemes. We improve this technique by introducing a new inner solver and a simple strategy for reducing the computational cost of each iteration. We evaluate the effectiveness of these improvements by solving large-scale benchmark problems and by comparison with a widely used decomposition package.
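To make the setting concrete, the sketch below is an illustrative reconstruction, not the authors' code: it minimizes the SVM dual objective f(x) = ½ xᵀGx − eᵀx over the feasible set {x : yᵀx = 0, 0 ≤ x ≤ C} with a projected Barzilai–Borwein gradient method, projecting onto the feasible set by bisection on the multiplier of the single equality constraint (a Pardalos–Kovoor-style projection). Function names, the bracket for the multiplier, and all parameter values are hypothetical; the decomposition layer (working-set selection) is omitted, so the solver is applied to the whole problem.

```python
import numpy as np

def project(z, y, C, tol=1e-10):
    """Project z onto {x : y @ x = 0, 0 <= x <= C} by bisection on the
    Lagrange multiplier of the single equality constraint."""
    def x_of(lam):
        return np.clip(z + lam * y, 0.0, C)
    lo, hi = -1e6, 1e6                      # bracket assumed wide enough
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if y @ x_of(mid) > 0.0:             # y @ x_of is nondecreasing in lam
            hi = mid
        else:
            lo = mid
    return x_of(0.5 * (lo + hi))

def gradient_projection(G, y, C, x0, max_iter=300):
    """Minimize f(x) = 0.5 x'Gx - sum(x) over the SVM dual feasible set
    with a projected Barzilai-Borwein gradient method (illustrative sketch)."""
    f = lambda x: 0.5 * x @ G @ x - x.sum()
    x = project(x0, y, C)
    g = G @ x - 1.0                         # gradient of the dual objective
    best_x, best_f = x.copy(), f(x)
    alpha = 1.0
    for _ in range(max_iter):
        x_new = project(x - alpha * g, y, C)
        s = x_new - x
        if np.linalg.norm(s) < 1e-9:        # projected step is stationary
            break
        g_new = G @ x_new - 1.0
        t = g_new - g
        sts = s @ t
        alpha = (s @ s) / sts if sts > 1e-12 else 1.0   # BB1 step length
        x, g = x_new, g_new
        fx = f(x)
        if fx < best_f:                     # keep best iterate as a simple
            best_f, best_x = fx, x.copy()   # safeguard (BB is nonmonotone)
    return best_x, best_f

# Tiny synthetic problem: Gaussian-kernel SVM dual with balanced labels.
rng = np.random.default_rng(0)
n = 40
y = np.ones(n); y[n // 2:] = -1.0
X = rng.normal(size=(n, 2)) + y[:, None]    # two shifted clusters
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
G = np.outer(y, y) * np.exp(-D2)            # Hessian of the dual objective
x_star, f_star = gradient_projection(G, y, C=1.0, x0=np.zeros(n))
print(f_star)
```

In a decomposition scheme of the kind discussed in the paper, `gradient_projection` would be invoked only on the working-set block of variables, with the fixed variables contributing a constant linear term to the subproblem.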
Zanni, L. An Improved Gradient Projection-based Decomposition Technique for Support Vector Machines. CMS 3, 131–145 (2006). https://doi.org/10.1007/s10287-005-0004-6