Incremental accelerated gradient methods for SVM classification: study of the constrained approach

  • Original Paper
  • Published in: Computational Management Science

Abstract

We investigate constrained first-order techniques for training support vector machines (SVMs) for online classification tasks. The methods exploit the structure of the SVM training problem and combine ideas from incremental gradient techniques, gradient acceleration, and successive simple calculations of Lagrange multipliers. Both primal and dual formulations are studied and compared. Experiments show that the constrained incremental algorithms working in the dual space achieve the best trade-off between prediction accuracy and training time. We also compare against an unconstrained large-scale learning algorithm (the Pegasos stochastic gradient method) to show that the constrained approach can remain competitive for large-scale learning, owing to the very special structure of the training problem.
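
For readers unfamiliar with the unconstrained baseline named in the abstract, the following is a minimal sketch of a Pegasos-style stochastic subgradient update for a linear SVM without a bias term. It is not the constrained incremental method studied in the paper. The function names, the regularization value `lam`, the iteration count, and the omission of the optional projection step are all illustrative assumptions.

```python
import random


def pegasos_train(samples, labels, lam=0.01, n_iters=2000, seed=0):
    """Train a linear SVM (no bias) with a Pegasos-style subgradient update.

    samples: list of feature lists; labels: list of +1 / -1.
    lam is the regularization weight (assumed value, not from the paper).
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    for t in range(1, n_iters + 1):
        # Pick one training example at random (stochastic step).
        i = rng.randrange(len(samples))
        x, y = samples[i], labels[i]
        eta = 1.0 / (lam * t)  # decreasing step size 1 / (lam * t)
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        # Shrink w: gradient step on the regularization term.
        scale = 1.0 - eta * lam
        w = [scale * wj for wj in w]
        # Hinge loss is active only when the margin is violated.
        if margin < 1.0:
            w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Return the predicted label (+1 or -1) for a feature list x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0.0 else -1
```

Each iteration touches a single example, which is what makes the method attractive at large scale; the constrained dual methods discussed in the abstract instead maintain feasibility of the Lagrange multipliers at each incremental step.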

References

  • Balakrishna P, Raman S, Santosa B, Trafalis TB (2008) Support vector regression for determining the minimum zone sphericity. Int J Adv Manuf Technol 35:916–923

  • Bertsekas DP (1996) A New Class of Incremental Gradient Methods for Least Squares Problems, Technical Report, Department of Electrical Engineering and Computer Science. MIT, Cambridge

  • Bertsekas DP (2010) Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey, Technical Report LIDS-2848. Laboratory for Information and Decision Systems, MIT, Cambridge

  • Bordes A, Ertekin S, Weston J, Bottou L (2005) Fast Kernel classifiers with online and active learning. J Mach Learn Res 6:1579–1619

  • Bottou L, Bousquet O (2008) The tradeoffs of large scale learning. In: Advances in neural information processing systems, vol 20. MIT Press, Cambridge

  • Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Disc 2:121–167

  • Chapelle O (2007) Training a support vector machine in the primal. Neural Comp 19:1155–1178

  • Chang C-C, Lin C-J (2010) LIBSVM: A Library for Support Vector Machines. http://www.csie.ntu.edu.tw/cjlin/libsvmtools/datasets/. Department of Computer Science, National Taiwan University, Taipei 106, Taiwan

  • Couellan NP, Trafalis TB (2013) Online SVM learning with an incremental primal-dual technique. Optim Method Softw 28(2):256–275

  • Couellan NP, Trafalis TB (2013) An incremental primal-dual method for nonlinear programming with special structure. Optim Lett 7(1):51–62

  • Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines. Cambridge University Press, Cambridge

  • Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1924

  • Frank A, Asuncion A (2010) UCI Machine Learning Repository. http://archive.ics.uci.edu/ml, University of California, School of Information and Computer Science, Irvine

  • Gonzaga CC, Karas EW (2008) Optimal steepest descent algorithms for unconstrained convex problems: fine tuning Nesterov’s method. Technical Report, Federal University of Paraná, Brazil

  • Gonzaga CC, Karas EW, Rossetto DR (2011) An Optimal Algorithm for Constrained Differentiable Convex Optimization. Technical Report, Federal University of Paraná, Brazil

  • Joachims T (2006) Training Linear SVMs in Linear Time. In: Proceedings of the ACM conference on knowledge discovery and data mining (KDD). ACM, USA

  • Lee YJ, Mangasarian OL (2001) RSVM: Reduced Support Vector Machines. In: Proceedings of the SIAM international conference on data mining. SIAM, Philadelphia

  • Lee YJ, Mangasarian OL, Wolberg WH (2000) Breast Cancer Survival and Chemotherapy: A Support Vector Machine Analysis, Data Mining Institute Technical Report 99–10, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol 55. American Mathematical Society, pp 1–10

  • Mangasarian OL, Musicant DR (1999) Successive overrelaxation for support vector machines. IEEE Trans Neural Netw 10:1032–1037

  • Matlab (1994–2010) http://www.mathworks.com, The MathWorks, Inc., Natick

  • Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281

  • Nesterov Y (2004) Introductory lectures on convex optimization. Applied optimization. Kluwer Academic Publishers, Boston

  • Platt JC (1999) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods. MIT Press, Cambridge, pp 185–208

  • Schölkopf B, Smola A (2002) Learning with Kernels. MIT, Cambridge

  • Shalev-Shwartz S, Singer Y, Srebro N, Cotter A (2011) Pegasos: primal estimated sub-gradient solver for SVM. Math Prog Ser B 127:3–30

  • Son H-J, Trafalis TB (2006) Detection of tornados using an incremental revised support vector machine with filters. Int Conf Comput Sci 3:506–513

  • Sra S, Nowozin S, Wright SJ (eds) (2011) Optimization for Machine Learning. MIT Press, Cambridge

  • Trafalis TB, Adrianto I, Richman MB (2010) Machine Learning Techniques for Imbalanced Data: An Application for Tornado Detection. In: Dagli CH (ed) Part IV: General Engineering Systems, from: Intelligent Engineering Systems through Artificial Neural Networks, vol 20

  • Trafalis TB, Couellan NP, Li P-I, Stumpf G, White A (1997) Affine scaling neural network training algorithm for prediction of tornados. In: Intelligent engineering systems through artificial neural networks. New York, pp 213–218

  • Tseng P (1998) An incremental gradient(-projection) method with momentum term and adaptive stepsize rule. SIAM J Optim 8(2):506–531

  • Vapnik V (1998) Statistical learning theory. Wiley, New York

  • Zhang X, Saha A, Vishwanathan SVN (2010) Lower Bounds on Rate of Convergence of Cutting Plane Methods. In: Proceedings of advances in neural information processing systems 23: 24th annual conference on neural information processing systems 2010. Vancouver, British Columbia

  • Zhou T, Tao D, Wu X (2010) NESVM: a fast gradient method for support vector machines. In: Proceedings of the 2010 IEEE international conference on data mining. IEEE Computer Society, Washington, DC, pp 679–688

Acknowledgments

We would like to thank the various anonymous reviewers of this article for their useful comments and suggestions.

The Tornado dataset was kindly provided by M.B. Richman and T.B. Trafalis (Trafalis et al. 2010), from the School of Meteorology and the School of Industrial and Systems Engineering, respectively, at the University of Oklahoma, USA.

Author information

Corresponding author

Correspondence to Nicolas Couellan.

About this article

Cite this article

Couellan, N., Jan, S. Incremental accelerated gradient methods for SVM classification: study of the constrained approach. Comput Manag Sci 11, 419–444 (2014). https://doi.org/10.1007/s10287-013-0186-2

  • DOI: https://doi.org/10.1007/s10287-013-0186-2
