
An Improved Gradient Projection-based Decomposition Technique for Support Vector Machines

  • Original Paper
  • Published in: Computational Management Science

Abstract

In this paper we propose some improvements to a recent decomposition technique for the large quadratic program arising in training support vector machines. Like standard decomposition approaches, the technique we consider is based on the idea of optimizing, at each iteration, a subset of the variables through the solution of a quadratic programming subproblem. The innovative features of this approach are the use of a very effective gradient projection method for the inner subproblems and a special rule for selecting the variables to be optimized at each step. These features make it possible to obtain promising performance by decomposing the problem into a few large subproblems instead of the many small subproblems typically used by other decomposition schemes. We improve this technique by introducing a new inner solver and a simple strategy for reducing the computational cost of each iteration. We evaluate the effectiveness of these improvements by solving large-scale benchmark problems and by comparing with a widely used decomposition package.
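To make the setting concrete, the following is a minimal sketch (not the authors' implementation) of the kind of inner solver the abstract describes: a gradient projection method with Barzilai-Borwein step lengths applied to the SVM dual QP, with the projection onto the feasible set {0 ≤ α ≤ C, yᵀα = 0} computed by bisection on the equality-constraint multiplier, in the spirit of Pardalos-Kovoor-type methods for singly constrained QPs. The function names `project` and `spg_svm` are hypothetical.

```python
import numpy as np

def project(z, C, y, tol=1e-10):
    # Project z onto {x : 0 <= x <= C, y @ x = 0} by bisection on the
    # multiplier lam of the equality constraint (a Pardalos-Kovoor-style
    # scheme for singly linearly constrained, box-bounded problems).
    lo = -(C + np.max(np.abs(z)))
    hi = C + np.max(np.abs(z))
    x = np.clip(z, 0.0, C)
    for _ in range(100):
        lam = 0.5 * (lo + hi)
        x = np.clip(z + lam * y, 0.0, C)
        r = y @ x                      # equality-constraint residual
        if abs(r) < tol:
            break
        if r > 0.0:                    # r is nondecreasing in lam
            hi = lam
        else:
            lo = lam
    return x

def spg_svm(Q, y, C, max_iter=1000, tol=1e-8):
    # Projected gradient with a safeguarded Barzilai-Borwein step for the
    # SVM dual:  min 0.5 * a @ Q @ a - sum(a)  s.t.  0 <= a <= C, y @ a = 0,
    # where Q[i, j] = y[i] * y[j] * K(x_i, x_j).
    n = Q.shape[0]
    alpha = np.zeros(n)                # feasible starting point
    g = Q @ alpha - np.ones(n)         # gradient of the dual objective
    step = 1.0
    for _ in range(max_iter):
        alpha_new = project(alpha - step * g, C, y)
        d = alpha_new - alpha
        if np.linalg.norm(d) < tol:    # fixed point of projection: KKT point
            break
        g_new = Q @ alpha_new - np.ones(n)
        sy = d @ (g_new - g)
        # Barzilai-Borwein (BB1) step length, safeguarded into [1e-10, 1e10]
        step = (d @ d) / sy if sy > 1e-16 else 1.0
        step = float(np.clip(step, 1e-10, 1e10))
        alpha, g = alpha_new, g_new
    return alpha
```

In a decomposition scheme of the kind discussed here, such a solver would be applied not to the full dual but to the large working-set subproblems selected at each outer iteration; practical versions also safeguard the Barzilai-Borwein iteration with a nonmonotone line search.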



Author information

Correspondence to Luca Zanni.

About this article

Cite this article

Zanni, L. An Improved Gradient Projection-based Decomposition Technique for Support Vector Machines. CMS 3, 131–145 (2006). https://doi.org/10.1007/s10287-005-0004-6
