
Alternating Direction Method of Multipliers for Regularized Multiclass Support Vector Machines

  • Conference paper
Machine Learning, Optimization, and Big Data (MOD 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9432)

Abstract

The support vector machine (SVM) was originally designed for binary classification. Considerable effort has gone into generalizing the binary SVM to the multiclass SVM (MSVM), which is a more complex problem. Initially, MSVMs were solved through their dual formulations, which are quadratic programs and can be solved by standard second-order methods. However, the duals of MSVMs with regularizers are usually more difficult to formulate and computationally very expensive to solve. This paper focuses on several regularized MSVMs and extends the alternating direction method of multipliers (ADMM) to them. Using a splitting technique, all considered MSVMs are written as two-block convex programs, for which the ADMM has global convergence guarantees. Numerical experiments on synthetic and real data demonstrate the high efficiency and accuracy of our algorithms.
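
For orientation, the generic two-block template that ADMM is usually stated for can be written as follows. This is the standard textbook form, not the paper's specific splitting: the symbols \(f\), \(g\), \(A\), \(B\), \(c\), the multiplier \(\lambda\), and the penalty parameter \(\rho>0\) are placeholders introduced here for illustration.

```latex
\begin{aligned}
&\min_{x,\,z}\; f(x) + g(z) \quad \text{s.t.}\quad Ax + Bz = c, \\
&L_\rho(x,z,\lambda) = f(x) + g(z) + \lambda^\top (Ax + Bz - c)
   + \tfrac{\rho}{2}\,\|Ax + Bz - c\|_2^2, \\
&x^{k+1} = \arg\min_{x}\; L_\rho(x, z^{k}, \lambda^{k}), \\
&z^{k+1} = \arg\min_{z}\; L_\rho(x^{k+1}, z, \lambda^{k}), \\
&\lambda^{k+1} = \lambda^{k} + \rho\,\bigl(Ax^{k+1} + Bz^{k+1} - c\bigr).
\end{aligned}
```

When \(f\) and \(g\) are convex, each iteration alternates between the two blocks and then updates the multiplier; this two-block structure is the setting the abstract's global convergence statement refers to.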

Notes

  1. We do not include the constraints \(\mathbf {W}\mathbf{e}=\mathbf {0},\ \mathbf{e}^\top \mathbf{b}=0\) in the augmented Lagrangian; instead, we include them in the \((\mathbf {W},\mathbf{b})\)-subproblem; see the update (5a).

  2. For the case of \(n\ll p\), we found that using the Woodbury matrix identity can be about 100 times faster than preconditioned conjugate gradient (pcg) with moderate tolerance \(10^{-6}\) for solving the linear system (7).
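
The idea behind note 2 can be sketched in Python with NumPy. The linear system (7) is not reproduced on this page, so the sketch assumes a system of the form \((\rho \mathbf{I}_p + \mathbf{X}^\top\mathbf{X})\mathbf{w}=\mathbf{r}\) with \(\mathbf{X}\in\mathbb{R}^{n\times p}\); that form, the function name `woodbury_solve`, and the test sizes are all assumptions made for illustration. Under that assumption, the Woodbury identity replaces a \(p\times p\) solve with an \(n\times n\) one.

```python
import numpy as np

def woodbury_solve(X, r, rho):
    """Solve (rho*I_p + X.T @ X) w = r through an n-by-n system.

    Uses the Woodbury identity
        (rho*I_p + X^T X)^{-1} = (1/rho) * (I_p - X^T (rho*I_n + X X^T)^{-1} X),
    which is cheaper than a direct p-by-p solve when n << p.
    The system form is an assumption, not the paper's system (7).
    """
    n = X.shape[0]
    small = rho * np.eye(n) + X @ X.T          # n-by-n matrix
    correction = X.T @ np.linalg.solve(small, X @ r)
    return (r - correction) / rho

# Small synthetic check with n << p (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
n, p, rho = 20, 500, 0.5
X = rng.standard_normal((n, p))
r = rng.standard_normal(p)

w = woodbury_solve(X, r, rho)
w_direct = np.linalg.solve(rho * np.eye(p) + X.T @ X, r)  # p-by-p reference solve
residual = np.linalg.norm((rho * np.eye(p) + X.T @ X) @ w - r)
```

The reported speedup of about 100 times is the authors' observation for their own system and data; this sketch only checks that the two routes agree numerically.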

Author information

Corresponding author

Correspondence to Ioannis Akrotirianakis.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Xu, Y., Akrotirianakis, I., Chakraborty, A. (2015). Alternating Direction Method of Multipliers for Regularized Multiclass Support Vector Machines. In: Pardalos, P., Pavone, M., Farinella, G., Cutello, V. (eds) Machine Learning, Optimization, and Big Data. MOD 2015. Lecture Notes in Computer Science, vol 9432. Springer, Cham. https://doi.org/10.1007/978-3-319-27926-8_10

  • DOI: https://doi.org/10.1007/978-3-319-27926-8_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27925-1

  • Online ISBN: 978-3-319-27926-8

  • eBook Packages: Computer Science, Computer Science (R0)
