Abstract
We investigate constrained first-order techniques for training support vector machines (SVMs) for online classification tasks. The methods exploit the structure of the SVM training problem and combine ideas from incremental gradient techniques, gradient acceleration, and successive simple calculations of Lagrange multipliers. Both primal and dual formulations are studied and compared. Experiments show that the constrained incremental algorithms working in the dual space achieve the best trade-off between prediction accuracy and training time. We also compare against an unconstrained large-scale learning algorithm (the Pegasos stochastic gradient method) to show that the constrained approach can remain competitive for large-scale learning because of the very special structure of the training problem.
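For orientation, the unconstrained baseline named above can be sketched in a few lines. This is a minimal illustration of the standard Pegasos update for a linear SVM (Shalev-Shwartz et al. 2011), not the constrained incremental method studied in this article. The function names, the regularization value, the iteration count, and the toy data are all assumptions chosen for the example.

```python
import random


def pegasos_train(X, y, lam=0.1, n_iters=2000, seed=0):
    """Sketch of Pegasos: stochastic sub-gradient descent on the primal
    objective lam/2 * ||w||^2 + mean(hinge loss), linear kernel, no bias.
    lam, n_iters and seed are illustrative defaults (assumptions)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))      # draw one training example
        eta = 1.0 / (lam * t)          # step size 1 / (lam * t)
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrink step coming from the regularization term.
        w = [(1.0 - eta * lam) * wj for wj in w]
        # Hinge-loss sub-gradient step, only if the margin is violated.
        if margin < 1.0:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Sign of the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0.0 else -1


# Toy, linearly separable data (made up for this example).
X = [[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]]
y = [1, 1, -1, -1]

w = pegasos_train(X, y)
predictions = [predict(w, x) for x in X]
```

Each iteration touches a single example, which is the property that makes the method attractive at large scale. The constrained dual methods compared in the article are likewise incremental but additionally keep the Lagrange multipliers feasible.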
References
Balakrishna P, Raman S, Santosa B, Trafalis TB (2008) Support vector regression for determining the minimum zone sphericity. Int J Adv Manuf Technol 35:916–923
Bertsekas DP (1996) A New Class of Incremental Gradient Methods for Least Squares Problems, Technical Report, Department of Electrical Engineering and Computer Science. MIT, Cambridge
Bertsekas DP (2010) Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey, Technical Report LIDS-2848. Laboratory for Information and Decision Systems, MIT, Cambridge
Bordes A, Ertekin S, Weston J, Bottou L (2005) Fast Kernel classifiers with online and active learning. J Mach Learn Res 6:1579–1619
Bottou L, Bousquet O (2008) The tradeoffs of large scale learning. In: Advances in neural information processing systems, vol 20. MIT Press, Cambridge
Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Disc 2:121–167
Chapelle O (2007) Training a support vector machine in the primal. Neural Comput 19:1155–1178
Chang C-C, Lin C-J (2010) LIBSVM—A Library for Support Vector Machines. http://www.csie.ntu.edu.tw/cjlin/libsvmtools/datasets/ Department of Computer Science National Taiwan University, Taipei 106, Taiwan
Couellan NP, Trafalis TB (2013) Online SVM learning with an incremental primal-dual technique. Optim Method Softw 28(2):256–275
Couellan NP, Trafalis TB (2013) An incremental primal-dual method for nonlinear programming with special structure. Optim Lett 7(1):51–62
Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines. Cambridge University Press, Cambridge
Dietterich TG (1998) Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput 10(7):1895–1924
Frank A, Asuncion A (2010) UCI Machine Learning Repository. http://archive.ics.uci.edu/ml, University of California, School of Information and Computer Science, Irvine
Gonzaga CC, Karas EW (2008) Optimal Steepest descent algorithms for unconstrained convex problems: fine tuning Nesterov’s method. Technical Report, Federal University of Paraná, Brazil
Gonzaga CC, Karas EW, Rossetto DR (2011) An Optimal Algorithm for Constrained Differentiable Convex Optimization. Technical Report, Federal University of Paraná, Brazil
Joachims T (2006) Training Linear SVMs in Linear Time. In: Proceedings of the ACM conference on knowledge discovery and data mining (KDD). ACM, USA
Lee YJ, Mangasarian OL (2001) RSVM: Reduced Support Vector Machines. In: Proceedings of the SIAM international conference on data mining. SIAM, Philadelphia
Lee YJ, Mangasarian OL, Wolberg WH (2000) Breast Cancer Survival and Chemotherapy: A Support Vector Machine Analysis, Data Mining Institute Technical Report 99–10, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol 55. American Mathematical Society, pp 1–10
Mangasarian OL, Musicant DR (1999) Successive overrelaxation for support vector machines. IEEE Trans Neural Netw 10:1032–1037
Matlab (1994–2010) http://www.mathworks.com, The MathWorks, Inc., Natick
Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281
Nesterov Y (2004) Introductory lectures on convex optimization. Applied optimization. Kluwer Academic Publishers, Boston
Platt JC (1999) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods. MIT Press, Cambridge, pp 185–208
Schölkopf B, Smola A (2002) Learning with Kernels. MIT, Cambridge
Shalev-Shwartz S, Singer Y, Srebro N, Cotter A (2011) Pegasos: primal estimated sub-gradient solver for SVM. Math Prog Ser B 127:3–30
Son H-J, Trafalis TB (2006) Detection of tornados using an incremental revised support vector machine with filters. Int Conf Comput Sci 3:506–513
Sra S, Nowozin S, Wright SJ (eds) (2011) Optimization for Machine Learning. MIT Press, Cambridge
Trafalis TB, Adrianto I, Richman MB (2010) Machine Learning Techniques for Imbalanced Data: An Application for Tornado Detection. In: Dagli CH (ed) Part IV: General Engineering Systems from: Intelligent Engineering Systems through Artificial Neural Networks, vol 20
Trafalis TB, Couellan NP, Li P-I, Stumpf G, White A (1997) Affine scaling neural network training algorithm for prediction of tornados. In: Intelligent engineering systems through artificial neural networks. New York, pp 213–218
Tseng P (1998) An incremental gradient(-projection) method with momentum term and adaptive stepsize rule. SIAM J Optim 8(2):506–531
Vapnik V (1998) Statistical learning theory. Wiley, New York
Zhang X, Saha A, Vishwanathan SVN (2010) Lower Bounds on Rate of Convergence of Cutting Plane Methods. In: Proceedings of advances in neural information processing systems 23: 24th annual conference on neural information processing systems 2010. Vancouver, British Columbia
Zhou T, Tao D, Wu X (2010) NESVM: a fast gradient method for support vector machines. In: Proceedings of the 2010 IEEE international conference on data mining. IEEE Computer Society, Washington, DC, pp 679–688
Acknowledgments
We would like to thank the various anonymous reviewers of this article for their useful comments and suggestions.
The Tornado dataset was kindly provided by M.B. Richman and T.B. Trafalis (Trafalis et al. 2010), respectively from the School of Meteorology and the School of Industrial and Systems Engineering at the University of Oklahoma, USA.
About this article
Cite this article
Couellan, N., Jan, S. Incremental accelerated gradient methods for SVM classification: study of the constrained approach. Comput Manag Sci 11, 419–444 (2014). https://doi.org/10.1007/s10287-013-0186-2
DOI: https://doi.org/10.1007/s10287-013-0186-2
Keywords
- Support vector machines
- Kernel technique
- Machine learning
- Incremental gradient method
- Accelerated gradient
- Constrained gradient
- Nonlinear programming