Abstract
The support vector machine (SVM) was originally designed for binary classifications. A lot of effort has been put to generalize the binary SVM to multiclass SVM (MSVM) which are more complex problems. Initially, MSVMs were solved by considering their dual formulations which are quadratic programs and can be solved by standard second-order methods. However, the duals of MSVMs with regularizers are usually more difficult to formulate and computationally very expensive to solve. This paper focuses on several regularized MSVMs and extends the alternating direction method of multiplier (ADMM) to these MSVMs. Using a splitting technique, all considered MSVMs are written as two-block convex programs, for which the ADMM has global convergence guarantees. Numerical experiments on synthetic and real data demonstrate the high efficiency and accuracy of our algorithms.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
We do not include the constraints \(\mathbf {W}\mathbf{e}=\mathbf {0},\ \mathbf{e}^\top \mathbf{b}=0\) in the augmented Lagrangian, but instead we include them in \((\mathbf {W},\mathbf{b})\)-subproblem; see the update (5a).
- 2.
For the case of \(n\ll p\), we found that using the Woodbury matrix identity can be about 100 times faster than preconditioned conjugate gradient (pcg) with moderate tolerance \(10^{-6}\) for the solving the linear system (7).
References
Bishop, C.: Pattern Recognition and Machine Learning. Springer, New York (2006)
Bottou, L., Cortes, C., Denker, J.S., Drucker, H., Guyon, I., Jackel, L.D., LeCun, Y., Muller, U.A., Sackinger, E., Simard, P., et al.: Comparison of classifier methods: a case study in handwritten digit recognition. In: Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77–82 (1994)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2010)
Bradley, P.S., Mangasarian, O.L.: Feature selection via concave minimization and support vector machines. In: Proceedings of the Fifteenth International Conference of Machine Learning (ICML 1998), pp. 82–90 (1998)
Chen, X., Pan, W., Kwok, J.T., Carbonell, J.G.: Accelerated gradient method for multi-task sparse learning problem. In: Proceedings of the Ninth International Conference on Data Mining (ICDM 2009), pp. 746–751. IEEE (2009)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)
Crammer, K., Singer, Y.: On the algorithmic implementation of multiclass kernel-based vector machines. J. Mach. Learn. Res. 2, 265–292 (2002)
Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. Rice technical report TR12-14 (2012)
Dudoit, S., Fridlyand, J., Speed, T.P.: Comparison of discrimination methods for the classification of tumors using gene expression data. J. Am. Stat. Assoc. 97(457), 77–87 (2002)
Glowinski, R.: Numerical Methods for Nonlinear Variational Problems. Springer, Heidelberg (2008)
Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Bloomfield, C.D., Lander, E.S.: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286(5439), 531–537 (1999)
Grant, M., Boyd, S.: CVX - Matlab software for disciplined convex programming, version 2.1 (2014). http://cvxr.com/cvx
Hager, W.W.: Updating the inverse of a matrix. SIAM Rev. 31, 221–239 (1989)
Hsu, C.W., Lin, C.J.: A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13(2), 415–425 (2002)
Khan, J., Wei, J.S., Ringnér, M., Saal, L.H., Ladanyi, M., Westermann, F., Berthold, F., Schwab, M., Antonescu, C.R., Peterson, C., et al.: Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat. Med. 7(6), 673–679 (2001)
Lee, Y., Lin, Y., Wahba, G.: Multicategory support vector machines. J. Am. Stat. Assoc. 99(465), 67–81 (2004)
Platt, J.C., Cristianini, N., Shawe-Taylor, J.: Large margin dags for multiclass classification. Adv. Neural Inf. Process. Syst. 12(3), 547–553 (2000)
Sturm, J.: Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11(1–4), 625–653 (1999)
Wang, L., Shen, X.: On \({L}_1\)-norm multiclass support vector machines. J. Am. Stat. Assoc. 102(478), 583–594 (2007)
Wang, L., Zhu, J., Zou, H.: Hybrid huberized support vector machines for microarray classification and gene selection. Bioinformatics 24(3), 412–419 (2008)
Ye, G.B., Chen, Y., Xie, X.: Efficient variable selection in support vector machines via the alternating direction method of multipliers. In: Proceedings of the International Conference on Artificial Intelligence and Statistics (2011)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Roy. Stat. Soc. Ser. B (Stat. Method.) 68(1), 49–67 (2006)
Zhang, H., Liu, Y., Wu, Y., Zhu, J.: Variable selection for the multicategory SVM via adaptive sup-norm regularization. Electron. J. Stat. 2, 149–167 (2008)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, Y., Akrotirianakis, I., Chakraborty, A. (2015). Alternating Direction Method of Multipliers for Regularized Multiclass Support Vector Machines. In: Pardalos, P., Pavone, M., Farinella, G., Cutello, V. (eds) Machine Learning, Optimization, and Big Data. MOD 2015. Lecture Notes in Computer Science(), vol 9432. Springer, Cham. https://doi.org/10.1007/978-3-319-27926-8_10
Download citation
DOI: https://doi.org/10.1007/978-3-319-27926-8_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27925-1
Online ISBN: 978-3-319-27926-8
eBook Packages: Computer ScienceComputer Science (R0)