Accelerating block coordinate descent methods with identification strategies

Lopes, R.; Santos, S. A.; Silva, P. J. S.

doi:10.1007/s10589-018-00056-8

Published: 02 January 2019

Volume 72, pages 609–640, (2019)
Computational Optimization and Applications

Abstract

This work is about active set identification strategies aimed at accelerating block-coordinate descent methods (BCDM) applied to large-scale problems. We start by devising an identification function tailored for bound-constrained composite minimization together with an associated version of the BCDM, called Active BCDM, that is also globally convergent. The identification function gives rise to an efficient practical strategy for Lasso and $\ell _1$-regularized logistic regression. The computational performance of Active BCDM is contextualized using comparative sets of experiments that are based on the solution of problems with data from deterministic instances from the literature. These results have been compared with those of well-established and state-of-the-art methods that are particularly suited for the classes of applications under consideration. Active BCDM has proved useful in achieving fast results due to its identification strategy. Besides that, an extra second-order step was used, with favorable cost-benefit.
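
As a rough illustration of the idea described in the abstract, the sketch below runs cyclic coordinate descent on a small Lasso problem, $\min_x \tfrac{1}{2}\Vert Ax-b\Vert^2 + \lambda \Vert x\Vert_1$, and skips coordinates that a simple test guesses to be zero at the solution. The test used here (a coordinate is dropped when $x_j = 0$ and $|a_j^T r| < \lambda - \text{margin}$, with $r = b - Ax$) is a generic screening heuristic chosen for illustration only; it is not the identification function proposed in the paper. The names `lasso_cd`, `margin`, and `sweeps` are hypothetical, and the code is unrelated to the authors' implementation.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def soft_threshold(z, t):
    """Proximal operator of t*|.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(A, b, lam, sweeps=200, tol=1e-10, margin=0.0):
    """Cyclic coordinate descent for 0.5*||Ax - b||^2 + lam*||x||_1.

    A is a list of rows. After each non-converged sweep, coordinates with
    x_j == 0 and |a_j^T r| < lam - margin are guessed to be zero at the
    solution and skipped (illustrative heuristic, not the paper's rule).
    A full sweep over all coordinates is always used to confirm convergence.
    """
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    sq_norms = [dot(c, c) for c in cols]
    x = [0.0] * n
    r = list(b)                      # residual b - A x, with x = 0 initially
    working = list(range(n))         # coordinates updated in the next sweep
    for _ in range(sweeps):
        full = len(working) == n
        max_change = 0.0
        for j in working:
            if sq_norms[j] == 0.0:
                continue             # zero column: x_j stays at 0
            # Exact minimization along coordinate j.
            rho = dot(cols[j], r) + sq_norms[j] * x[j]
            new = soft_threshold(rho, lam) / sq_norms[j]
            delta = new - x[j]
            if delta != 0.0:
                for i in range(m):
                    r[i] -= delta * cols[j][i]
                x[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            if full:
                break                # converged on the full problem
            working = list(range(n)) # verify the guess with a full sweep
        else:
            # Keep nonzero coordinates and those violating the zero test.
            working = [j for j in range(n)
                       if x[j] != 0.0 or abs(dot(cols[j], r)) >= lam - margin]
    return x
```

For an identity matrix the result reduces to componentwise soft-thresholding of `b`, so `lasso_cd([[1.0, 0, 0], [0, 1.0, 0], [0, 0, 1.0]], [3.0, 0.5, -2.0], 1.0)` returns `[2.0, 0.0, -1.0]`. The full sweep before termination is what keeps a wrong guess from affecting the final answer; the restricted sweeps only save work.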

Notes

The code is available at http://www.ime.unicamp.br/~pjssilva/code/abcd.

Acknowledgements

We are thankful for the comments and suggestions of two anonymous referees, which helped us improve the presentation of our work.

Author information

Authors and Affiliations

Institute of Mathematics, University of Campinas, Rua Sergio Buarque de Holanda, 651, Campinas, SP, 13083-859, Brazil
R. Lopes, S. A. Santos & P. J. S. Silva

Corresponding author

Correspondence to S. A. Santos.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Partially supported by FAPESP Grants 2014/14228-6, 2013/05475-7, and 2013/07375-0, and by CNPq Grants 306986/2016-7 and 302915/2016-8.

About this article

Cite this article

Lopes, R., Santos, S.A. & Silva, P.J.S. Accelerating block coordinate descent methods with identification strategies. Comput Optim Appl 72, 609–640 (2019). https://doi.org/10.1007/s10589-018-00056-8

Received: 22 November 2017
Published: 02 January 2019
Issue Date: 15 April 2019
DOI: https://doi.org/10.1007/s10589-018-00056-8
