A coordinate gradient descent method for linearly constrained smooth optimization and support vector machines training

Abstract

Support vector machines (SVMs) training may be posed as a large quadratic program (QP) with bound constraints and a single linear equality constraint. We propose a (block) coordinate gradient descent method for solving this problem and, more generally, linearly constrained smooth optimization. Our method is closely related to decomposition methods currently popular for SVM training. We establish global convergence and, under a local error bound assumption (which is satisfied by the SVM QP), a linear rate of convergence for our method when the coordinate block is chosen by a Gauss-Southwell-type rule to ensure sufficient descent. We show that, for the SVM QP with n variables, this rule can be implemented in O(n) operations using Rockafellar’s notion of conformal realization. Thus, for SVM training, our method requires only O(n) operations per iteration and, in contrast to existing decomposition methods, achieves linear convergence without additional assumptions. We report our numerical experience with the method on some large SVM QPs arising from two-class data classification. Our experience suggests that the method can be efficient for SVM training with a nonlinear kernel.
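For readers unfamiliar with the setup, the QP in question is the standard dual of the two-class SVM training problem: with labels $y_i \in \{+1,-1\}$, kernel $K$, and penalty parameter $C > 0$,

$$
\min_{\alpha \in \mathbb{R}^n} \ \tfrac{1}{2}\,\alpha^{\top} Q \alpha - e^{\top}\alpha
\quad \text{subject to} \quad y^{\top}\alpha = 0, \quad 0 \le \alpha \le C e,
$$

where $Q_{ij} = y_i y_j K(x_i, x_j)$ and $e$ is the vector of all ones.

The sketch below illustrates the flavor of one iteration in the two-coordinate (SMO-like) special case of such methods. It is a minimal sketch, not the authors' implementation: the working set is chosen by the classical maximal-violating-pair rule as a stand-in for the paper's Gauss-Southwell-type block rule (whose O(n) implementation via conformal realization is the paper's contribution and is not reproduced here), and the function name and interface are hypothetical.

```python
import numpy as np

def smo_like_step(alpha, grad, Q, y, C, tol=1e-6):
    """One two-coordinate update for
        min 0.5*a'Qa - e'a   s.t.   y'a = 0,  0 <= a <= C,
    where grad = Q @ alpha - 1 is maintained by the caller.
    Illustrative stand-in only; not the paper's block rule."""
    # Standard I_up / I_low candidate sets: coordinates with room to
    # move without leaving the box [0, C]^n.
    up = ((y > 0) & (alpha < C)) | ((y < 0) & (alpha > 0))
    low = ((y > 0) & (alpha > 0)) | ((y < 0) & (alpha < C))
    v = -y * grad                                 # KKT violation scores
    i = int(np.argmax(np.where(up, v, -np.inf)))  # most violating in I_up
    j = int(np.argmin(np.where(low, v, np.inf)))  # most violating in I_low
    if v[i] - v[j] <= tol:                        # KKT holds to tolerance
        return False
    # Direction d with d_i = y_i, d_j = -y_j preserves y'alpha = 0.
    curv = Q[i, i] + Q[j, j] - 2.0 * y[i] * y[j] * Q[i, j]
    step = (v[i] - v[j]) / max(curv, 1e-12)       # exact line minimizer
    # Clip the step so alpha_i and alpha_j stay inside [0, C].
    room_i = C - alpha[i] if y[i] > 0 else alpha[i]
    room_j = alpha[j] if y[j] > 0 else C - alpha[j]
    step = min(step, room_i, room_j)
    alpha[i] += step * y[i]
    alpha[j] -= step * y[j]
    grad += step * (y[i] * Q[:, i] - y[j] * Q[:, j])  # keep grad = Qa - 1
    return True
```

Starting from alpha = np.zeros(n) (so grad = -np.ones(n)) and repeating this step until it reports no violating pair gives a basic SMO-style solver. The paper's method generalizes the selection from pairs to larger coordinate blocks while keeping the per-iteration selection cost at O(n).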

Author information

Correspondence to Paul Tseng.

Additional information

This research is supported by the National Science Foundation, Grant No. DMS-0511283.

About this article

Cite this article

Tseng, P., Yun, S. A coordinate gradient descent method for linearly constrained smooth optimization and support vector machines training. Comput Optim Appl 47, 179–206 (2010). https://doi.org/10.1007/s10589-008-9215-4
