Primal and dual predicted decrease approximation methods

  • Full Length Paper
  • Series B
  • Published in: Mathematical Programming

Abstract

We introduce the notion of predicted decrease approximation (PDA) for constrained convex optimization, a flexible framework which includes as special cases known algorithms such as generalized conditional gradient, proximal gradient, greedy coordinate descent for separable constraints and working set methods for linear equality constraints with bounds. The new scheme allows the development of a unified convergence analysis for these methods. We further consider a partially strongly convex nonsmooth model and show that dual application of PDA-based methods yields new sublinear convergence rate estimates in terms of both primal and dual objectives. As an example of an application, we provide an explicit working set selection rule for SMO-type methods for training the support vector machine with an improved primal convergence analysis.
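
To make the list of special cases concrete, the following is a minimal, illustrative Python sketch of one of them: a classical conditional gradient (Frank–Wolfe) iteration over the unit simplex. The quadratic objective, the simplex constraint, and the open-loop step size 2/(k+2) are standard textbook choices used here only for illustration; they are not the model or the PDA selection rules analyzed in the paper. The quantity `gap` computed below is the predicted decrease of the linearized model at the current iterate, the kind of quantity that the PDA framework approximates.

```python
import numpy as np

# Illustrative sketch: conditional gradient (Frank-Wolfe) on the probability
# simplex, one of the special cases covered by the PDA framework.
# Objective f(x) = 0.5 * ||A x - b||^2 is an arbitrary smooth convex example.

rng = np.random.default_rng(0)
n, m = 50, 20
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def grad(x):
    """Gradient of f(x) = 0.5 * ||A x - b||^2."""
    return A.T @ (A @ x - b)

x = np.full(n, 1.0 / n)              # start at the simplex barycenter
for k in range(200):
    g = grad(x)
    # Linear minimization oracle over the simplex: the minimizing vertex,
    # i.e. the standard basis vector of the smallest gradient coordinate.
    s = np.zeros(n)
    s[np.argmin(g)] = 1.0
    # Predicted decrease of the linear model; also a duality-gap certificate.
    gap = g @ (x - s)
    if gap <= 1e-6:
        break
    step = 2.0 / (k + 2)             # classical open-loop step size
    x = x + step * (s - x)
```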

Author information

Corresponding author: Amir Beck.

Additional information

The research of Amir Beck was partially supported by the Israel Science Foundation Grant 1821/16. The research of Edouard Pauwels was partially sponsored by a grant from the Air Force Office of Scientific Research, Air Force Material Command (Grant No. FA9550-15-1-0500). Most of this work took place during Edouard Pauwels' postdoctoral stay at the Technion, Haifa, Israel.

About this article

Cite this article

Beck, A., Pauwels, E. & Sabach, S. Primal and dual predicted decrease approximation methods. Math. Program. 167, 37–73 (2018). https://doi.org/10.1007/s10107-017-1108-9
