
Coordinate descent algorithm for covariance graphical lasso


Abstract

Bien and Tibshirani (Biometrika, 98(4):807–820, 2011) proposed a covariance graphical lasso method that applies a lasso penalty to the elements of the covariance matrix. The method is useful because it not only produces sparse and positive definite estimates of the covariance matrix but also discovers marginal independence structures by generating exact zeros in the estimated covariance matrix. However, the objective function is not convex, which makes the optimization challenging. Bien and Tibshirani (2011) described a majorize-minimize approach to optimizing it. We develop a new optimization method based on coordinate descent and discuss its convergence properties. Through simulation experiments, we show that the new algorithm has a number of advantages over the majorize-minimize approach, including simplicity, computing speed and numerical stability. Finally, we show that the cyclic version of the coordinate descent algorithm is more efficient than the greedy version.
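
Although the paper's objective function (1) is not reproduced on this page, the appendix below approximates it by replacing each \(|\sigma_{ij}|\) with \(\sqrt{\sigma_{ij}^{2}+\epsilon}\); letting ϵ→0 in that approximation, the penalized objective minimized over positive definite Σ, with sample covariance S and penalty parameter ρ, reads

$$ \operatorname{log}(\det\boldsymbol{\Sigma})+\operatorname{tr}\bigl(\mathbf{S}\boldsymbol{\Sigma}^{-1}\bigr) + 2\rho \sum_{i<j}|\sigma_{ij}| + \rho\sum_{i} \sigma_{ii}. $$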


Notes

  1. An unpublished Ph.D. dissertation (Lin 2010) may have considered the covariance graphical lasso method earlier than Bien and Tibshirani (2011).

References

  • Bien, J., Tibshirani, R.J.: Sparse estimation of a covariance matrix. Biometrika 98(4), 807–820 (2011). doi:10.1093/biomet/asr054

  • Breheny, P., Huang, J.: Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Stat. 5(1), 232–253 (2011)

  • Dempster, A.: Covariance selection. Biometrics 28, 157–175 (1972)

  • Friedman, J., Hastie, T., Höfling, H., Tibshirani, R.: Pathwise coordinate optimization. Ann. Appl. Stat. 1(2), 302–332 (2007)

  • Friedman, J., Hastie, T., Tibshirani, R.: Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 432–441 (2008)

  • Fu, W.J.: Penalized regressions: the bridge versus the lasso. J. Comput. Graph. Stat. 7(3), 397–416 (1998)

  • Hunter, D.R., Lange, K.: A tutorial on MM algorithms. Am. Stat. 58(1), 30–37 (2004). doi:10.2307/27643496

  • Lin, N.: A penalized likelihood approach in covariance graphical model selection. Ph.D. Thesis, National University of Singapore (2010)

  • Sardy, S., Bruce, A.G., Tseng, P.: Block coordinate relaxation methods for nonparametric wavelet denoising. J. Comput. Graph. Stat. 9(2), 361–379 (2000)

  • Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109, 475–494 (2001)

  • Wu, T.T., Lange, K.: Coordinate descent algorithms for lasso penalized regression. Ann. Appl. Stat. 2(1), 224–244 (2008)

Author information

Corresponding author

Correspondence to Hao Wang.

Appendix: Details of the majorize-minimize (MM) algorithm in Sect. 4

Consider \(\sqrt{\sigma_{ij}^{2}+\epsilon}\) as an approximation to \(|\sigma_{ij}|\) for a small ϵ>0. Consequently, the original objective function (1) can be approximated by

$$ \operatorname{log}(\det\boldsymbol{\Sigma})+\operatorname {tr}\bigl(\mathbf{S}\boldsymbol{\Sigma}^{-1}\bigr) + 2\rho \sum _{i<j}\sqrt{\sigma_{ij}^2+ \epsilon} + \rho\sum_{i} \sigma_{ii}. $$
(9)
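
As a concrete numerical illustration (ours, not code from the paper), the smoothed objective (9) can be evaluated directly. The following is a minimal NumPy sketch; the function name and interface are our own, and Σ is assumed symmetric and positive definite.

    import numpy as np

    def smoothed_objective(Sigma, S, rho, eps):
        # log det(Sigma); Sigma is assumed positive definite
        _, logdet = np.linalg.slogdet(Sigma)
        # tr(S Sigma^{-1}) computed via a linear solve for numerical stability
        trace_term = np.trace(np.linalg.solve(Sigma, S))
        # off-diagonal penalty: 2 * rho * sum_{i<j} sqrt(sigma_ij^2 + eps)
        iu = np.triu_indices_from(Sigma, k=1)
        penalty_off = 2.0 * rho * np.sum(np.sqrt(Sigma[iu] ** 2 + eps))
        # diagonal penalty: rho * sum_i sigma_ii
        penalty_diag = rho * np.sum(np.diag(Sigma))
        return logdet + trace_term + penalty_off + penalty_diag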

Note the inequality

$$\sqrt{\sigma_{ij}^2+\epsilon}\leq\sqrt{ \bigl(\sigma_{ij}^{(k)}\bigr)^2+\epsilon} + {\sigma_{ij}^2- (\sigma_{ij}^{(k)})^2 \over2 \sqrt{(\sigma_{ij}^{(k)})^2+\epsilon} }, $$

for a fixed \(\sigma_{ij}^{(k)}\) and all \(\sigma_{ij}\), which is the tangent-line bound obtained from the concavity of \(x \mapsto \sqrt{x+\epsilon}\) at \(x=(\sigma_{ij}^{(k)})^{2}\). Applying it to each off-diagonal term, (9) is majorized, up to an additive constant not involving Σ, by

$$ \operatorname{log}(\det\boldsymbol{\Sigma})+\operatorname{tr}\bigl(\mathbf{S}\boldsymbol{\Sigma}^{-1}\bigr) + \rho \sum_{i<j}\frac{\sigma_{ij}^2}{\sqrt{(\sigma_{ij}^{(k)})^2+\epsilon}} + \rho\sum_{i} \sigma_{ii}. $$
(10)

The minimization step of MM then minimizes (10) along each column (row) of Σ. Without loss of generality, consider the last column and row. Partition Σ and S as in (2) and consider the same transformation from \((\boldsymbol{\sigma}_{12},\sigma_{22})\) to \((\boldsymbol{\beta}= \boldsymbol{\sigma}_{12},\gamma=\sigma_{22} - \boldsymbol{\sigma}_{12}^{\prime}\boldsymbol{\Sigma}_{11}^{-1} \boldsymbol{\sigma}_{12})\). The four terms in (10) can be written as functions of (β,γ)

where \(c_{1}\), \(c_{2}\) and \(c_{3}\) are constants not involving (β,γ). Dropping \(c_{1}\), \(c_{2}\) and \(c_{3}\) from (10), we only have to minimize

(11)
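
To make the reparameterization above concrete, here is a minimal NumPy sketch (ours, not the paper's) of the map from the last column and row of Σ to (β, γ); other columns are handled analogously after permuting them to the last position.

    import numpy as np

    def last_column_to_beta_gamma(Sigma):
        # beta is the off-diagonal part of the last column, sigma_12
        p = Sigma.shape[0]
        Sigma11 = Sigma[:p - 1, :p - 1]
        beta = Sigma[:p - 1, p - 1].copy()
        sigma22 = Sigma[p - 1, p - 1]
        # gamma is the Schur complement sigma_22 - sigma_12' Sigma_11^{-1} sigma_12
        gamma = sigma22 - beta @ np.linalg.solve(Sigma11, beta)
        return beta, gamma

    # the inverse map recovers the diagonal entry: sigma_22 = gamma + beta' Sigma_11^{-1} beta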

For γ, it is easy to derive from (11) that the conditional minimum point given β is the same as in (5). For β, (11) can be written as a function of β,

$$Q\bigl(\boldsymbol{\beta}\mid\gamma, \varSigma^{(k)}\bigr) = \boldsymbol{\beta }^\prime\bigl(\mathbf{V}+\rho\mathbf{D}^{-1} \bigr) \boldsymbol{\beta}-2 \mathbf{u}^\prime\boldsymbol{\beta}, $$

where V and u are defined in (6). This implies that the conditional minimum point of β is \(\boldsymbol{\beta}=(\mathbf{V}+\rho\mathbf{D}^{-1})^{-1}\mathbf{u}\). Cycling through every column always drives down the approximated objective function (9). In our implementation, the value of ϵ is chosen as follows. The approximation error of (9) to (1) is

$$ 2\rho\sum_{i<j}\bigl(\sqrt{\sigma_{ij}^{2}+\epsilon}-|\sigma_{ij}|\bigr) \leq 2\rho\sum_{i<j}\sqrt{\epsilon} = \rho p(p-1)\sqrt{\epsilon}. $$

Note that the algorithm stops when the change of the objective function is less than \(10^{-3}\). We choose ϵ such that \(\rho p(p-1) \sqrt{\epsilon}=0.001\), i.e., \(\epsilon=(0.001/(\rho p(p-1)))^{2}\), to ensure that the choice of ϵ has no more influence on the estimated Σ than the stopping rule.
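
As a final sketch (ours, not the paper's code), the conditional β-update and the ε rule above can be written as follows. Here V and u are taken as given inputs in the sense of the paper's Eq. (6), which is not reproduced on this page, and D is assumed to be diagonal with its entries passed in D_diag; these interface choices are our assumptions.

    import numpy as np

    def choose_epsilon(rho, p, tol=1e-3):
        # eps such that rho * p * (p - 1) * sqrt(eps) equals the stopping tolerance
        return (tol / (rho * p * (p - 1))) ** 2

    def beta_update(V, u, D_diag, rho):
        # conditional minimizer beta = (V + rho * D^{-1})^{-1} u
        A = V + rho * np.diag(1.0 / np.asarray(D_diag, dtype=float))
        return np.linalg.solve(A, u)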


Cite this article

Wang, H. Coordinate descent algorithm for covariance graphical lasso. Stat Comput 24, 521–529 (2014). https://doi.org/10.1007/s11222-013-9385-5
