Abstract
We introduce the notion of predicted decrease approximation (PDA) for constrained convex optimization, a flexible framework that includes as special cases known algorithms such as the generalized conditional gradient, proximal gradient, greedy coordinate descent for separable constraints, and working set methods for linear equality constraints with bounds. The new scheme allows the development of a unified convergence analysis for these methods. We further consider a partially strongly convex nonsmooth model and show that dual application of PDA-based methods yields new sublinear convergence rate estimates in terms of both primal and dual objectives. As an application, we provide an explicit working set selection rule for SMO-type methods for training the support vector machine, together with an improved primal convergence analysis.
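To make the conditional gradient special case concrete, the following is a minimal sketch (not taken from the paper) of the classical Frank–Wolfe method over the unit simplex with the standard open-loop step size 2/(k+2). The quantity g·(x−s), the decrease predicted by the linear model at x, is the usual duality-gap certificate that frameworks of this kind generalize. The function name frank_wolfe_simplex, the tolerance, and the least-squares example are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, num_iters=100, tol=1e-8):
    """Classical conditional gradient (Frank-Wolfe) over the unit simplex.

    grad: callable returning the gradient of a smooth convex f at x.
    x0:   feasible starting point (nonnegative entries summing to 1).
    Uses the open-loop step size 2 / (k + 2).
    """
    x = x0.copy()
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle over the simplex: the vertex e_i
        # with the smallest gradient coordinate minimizes <g, s>.
        i = np.argmin(g)
        s = np.zeros_like(x)
        s[i] = 1.0
        # Predicted decrease of the linear model: <g, x - s>.
        # By convexity it upper-bounds f(x) - f*, so it serves as a
        # stopping certificate (the Frank-Wolfe duality gap).
        gap = g @ (x - s)
        if gap <= tol:
            break
        step = 2.0 / (k + 2)
        x = x + step * (s - x)
    return x

# Usage: minimize f(x) = 0.5 * ||A x - b||^2 over the simplex.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
x = frank_wolfe_simplex(lambda z: A.T @ (A @ z - b), np.full(5, 0.2))
print(x, x.sum())
```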



Additional information
The research of Amir Beck was partially supported by the Israel Science Foundation Grant 1821/16. The research of Edouard Pauwels was partially sponsored by a grant from the Air Force Office of Scientific Research, Air Force Materiel Command (Grant No. FA9550-15-1-0500). Most of this work took place during Edouard Pauwels' postdoctoral stay at the Technion, Haifa, Israel.
Cite this article
Beck, A., Pauwels, E. & Sabach, S. Primal and dual predicted decrease approximation methods. Math. Program. 167, 37–73 (2018). https://doi.org/10.1007/s10107-017-1108-9