A coordinate gradient descent method for linearly constrained smooth optimization and support vector machines training

Abstract

Support vector machines (SVMs) training may be posed as a large quadratic program (QP) with bound constraints and a single linear equality constraint. We propose a (block) coordinate gradient descent method for solving this problem and, more generally, linearly constrained smooth optimization. Our method is closely related to decomposition methods currently popular for SVM training. We establish global convergence and, under a local error bound assumption (which is satisfied by the SVM QP), a linear rate of convergence for our method when the coordinate block is chosen by a Gauss-Southwell-type rule to ensure sufficient descent. We show that, for the SVM QP with n variables, this rule can be implemented in O(n) operations using Rockafellar’s notion of conformal realization. Thus, for SVM training, our method requires only O(n) operations per iteration and, in contrast to existing decomposition methods, achieves linear convergence without additional assumptions. We report our numerical experience with the method on some large SVM QPs arising from two-class data classification. Our experience suggests that the method can be efficient for SVM training with a nonlinear kernel.
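For readers unfamiliar with the setup, the QP in question is the standard dual of the two-class SVM training problem: with labels $y_i \in \{+1,-1\}$, kernel $K$, and penalty parameter $C > 0$,

$$
\min_{\alpha \in \mathbb{R}^n} \ \tfrac{1}{2}\,\alpha^{\top} Q \alpha - e^{\top}\alpha
\quad \text{subject to} \quad y^{\top}\alpha = 0, \quad 0 \le \alpha \le C e,
$$

where $Q_{ij} = y_i y_j K(x_i, x_j)$ and $e$ is the vector of all ones.

The sketch below illustrates the flavor of one iteration in the two-coordinate (SMO-like) special case of such methods. It is a minimal sketch, not the authors' implementation: the working set is chosen by the classical maximal-violating-pair rule as a stand-in for the paper's Gauss-Southwell-type block rule (whose O(n) implementation via conformal realization is the paper's contribution and is not reproduced here), and the function name and interface are hypothetical.

```python
import numpy as np

def smo_like_step(alpha, grad, Q, y, C, tol=1e-6):
    """One two-coordinate update for
        min 0.5*a'Qa - e'a   s.t.   y'a = 0,  0 <= a <= C,
    where grad = Q @ alpha - 1 is maintained by the caller.
    Illustrative stand-in only; not the paper's block rule."""
    # Standard I_up / I_low candidate sets: coordinates with room to
    # move without leaving the box [0, C]^n.
    up = ((y > 0) & (alpha < C)) | ((y < 0) & (alpha > 0))
    low = ((y > 0) & (alpha > 0)) | ((y < 0) & (alpha < C))
    v = -y * grad                                 # KKT violation scores
    i = int(np.argmax(np.where(up, v, -np.inf)))  # most violating in I_up
    j = int(np.argmin(np.where(low, v, np.inf)))  # most violating in I_low
    if v[i] - v[j] <= tol:                        # KKT holds to tolerance
        return False
    # Direction d with d_i = y_i, d_j = -y_j preserves y'alpha = 0.
    curv = Q[i, i] + Q[j, j] - 2.0 * y[i] * y[j] * Q[i, j]
    step = (v[i] - v[j]) / max(curv, 1e-12)       # exact line minimizer
    # Clip the step so alpha_i and alpha_j stay inside [0, C].
    room_i = C - alpha[i] if y[i] > 0 else alpha[i]
    room_j = alpha[j] if y[j] > 0 else C - alpha[j]
    step = min(step, room_i, room_j)
    alpha[i] += step * y[i]
    alpha[j] -= step * y[j]
    grad += step * (y[i] * Q[:, i] - y[j] * Q[:, j])  # keep grad = Qa - 1
    return True
```

Starting from alpha = np.zeros(n) (so grad = -np.ones(n)) and repeating this step until it reports no violating pair gives a basic SMO-style solver. The paper's method generalizes the selection from pairs to larger coordinate blocks while keeping the per-iteration selection cost at O(n).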

Author information

Correspondence to Paul Tseng.

Additional information

This research is supported by the National Science Foundation, Grant No. DMS-0511283.

About this article

Cite this article

Tseng, P., Yun, S. A coordinate gradient descent method for linearly constrained smooth optimization and support vector machines training. Comput Optim Appl 47, 179–206 (2010). https://doi.org/10.1007/s10589-008-9215-4
