Split Bregman algorithms for sparse group Lasso with application to MRI reconstruction

Zou, Jian; Fu, Yuli

doi:10.1007/s11045-014-0282-7

Split Bregman algorithms for sparse group Lasso with application to MRI reconstruction

Published: 12 February 2014

Volume 26, pages 787–802, (2015)
Cite this article

Multidimensional Systems and Signal Processing Aims and scope Submit manuscript

Jian Zou¹ &
Yuli Fu²

782 Accesses
12 Citations
Explore all metrics

Abstract

Sparse group lasso, concerning with group-wise and within-group sparsity, is generally considered difficult to solve due to the mixed-norm structure. In this paper, we propose efficient algorithms based on split Bregman iteration to solve sparse group lasso problems, including a synthesis prior form and an analysis prior form. These algorithms have low computational complexity and are suitable for large scale problems. The convergence of the proposed algorithms is also discussed. Moreover, the proposed algorithms are used for magnetic resonance imaging reconstruction. Numerical results show the effectiveness of the proposed algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Group sparse recovery via group square-root elastic net and the iterative multivariate thresholding-based algorithm

Article 08 April 2022

$$\ell _{1/2,1}$$ group sparse regularization for compressive sensing

Article 20 October 2015

Solving constrained nonsmooth group sparse optimization via group Capped- $$\ell _1$$ relaxation and group smoothing proximal gradient algorithm

Article 22 October 2022

References

Bilen, C., Wang, Y., & Selesnick, I. W. (2012). High-speed compressed sensing reconstruction in dynamic parallel MRI using augmented lagrangian and parallel processing. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2(3), 370–379.
Article Google Scholar
Bruckstein, A. M., Donoho, D. L., & Elad, M. (2009). From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Review, 51(1), 34–81.
Article MATH MathSciNet Google Scholar
Cai, J. F., Osher, S., & Shen, Z. (2009). Split Bregman methods and frame based image restoration. Multiscale Modeling and Simulation, 8(2), 337–369.
Article MathSciNet Google Scholar
Candès, E. J., & Wakin, M. B. (2008). An introduction to compressive sampling. IEEE Signal Processing Magazine, 25(2), 21–30.
Article Google Scholar
Chatterjee, S., Banerjee, A., & Ganguly, A. (2011). Sparse group lasso for regression on land climate variables. In 2011 IEEE 11th international conference on data mining workshops (ICDMW) (pp. 1–8). IEEE.
Deng, W., Yin, W., & Zhang, Y. (2011). Group sparse optimization by alternating direction method. TR11-06, Department of Computational and Applied Mathematics, Rice University.
Friedman, J., Hastie, T., & Tibshirani, R. (2010). A note on the group lasso and a sparse group lasso. arXiv, preprint arXiv:10010736.
Goldstein, T., & Osher, S. (2009). The split Bregman method for $\ell _1$-regularized problems. SIAM Journal on Imaging Sciences, 2(2), 323–343.
Article MATH MathSciNet Google Scholar
Goldstein, T., Bresson, X., & Osher, S. (2010). Geometric applications of the split Bregman method: Segmentation and surface reconstruction. Journal of Scientific Computing, 45(1–3), 272–293.
Article MATH MathSciNet Google Scholar
He, Z., Xie, S., Ding, S., & Cichocki, A. (2007). Convolutive blind source separation in the frequency domain based on sparse representation. IEEE Transactions on Audio, Speech, and Language Processing, 15(5), 1551–1563.
Article Google Scholar
He Z, Cichocki, A., Zdunek, R., & Cao, J. (2008). CG-M-FOCUSS and its application to distributed compressed sensing. In: Advances in neural networks-ISNN 2008 (pp. 237–245). IEEE.
He, Z., Cichocki, A., Li, Y., Xie, S., & Sanei, S. (2009a). K-Hyperline clustering learning for sparse component analysis. Signal Processing, 89(6), 1011–1022.
Article MATH Google Scholar
He, Z., Cichocki, A., Zdunek, R., & Xie, S. (2009b). Improved FOCUSS method with conjugate gradient iterations. IEEE Transactions on Signal Processing, 57(1), 399–404.
Article MathSciNet Google Scholar
Huang, J., & Zhang, T. (2010). The benefit of group sparsity. The Annals of Statistics, 38(4), 1978–2004.
Article MATH MathSciNet Google Scholar
Jiang, L., & Yin, H. (2012). Bregman iteration algorithm for sparse nonnegative matrix factorizations via alternating $\ell _1$-norm minimization. Multidimensional Systems and Signal Processing, 23(3), 315–328.
Article MATH MathSciNet Google Scholar
Lee, D. H., Hong, C. P., & Lee, M. W. (2013a). Sparse magnetic resonance imaging reconstruction using the Bregman iteration. Journal of the Korean Physical Society, 62(2), 328–332.
Article Google Scholar
Lee, D. H., Hong, C. P., Lee, M. W., & Han, B. S. (2013b). Rapid 2D phase-contrast magnetic resonance angiography reconstruction algorithm via compressed sensing. Journal of the Korean Physical Society, 63(5), 1072–1076.
Article Google Scholar
Liu, B., King, K., Steckner, M., Xie, J., Sheng, J., & Ying, L. (2009a). Regularized sensitivity encoding (SENSE) reconstruction using bregman iterations. Magnetic Resonance in Medicine, 61(1), 145–152.
Article Google Scholar
Liu, J., Ji, S., & Ye, J. (2009b). SLEP: Sparse learning with efficient projections. Arizona State University.
Lustig, M., Donoho, D., & Pauly, J. M. (2007). Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6), 1182–1195.
Article Google Scholar
Lustig, M., Donoho, D. L., Santos, J. M., & Pauly, J. M. (2008). Compressed sensing MRI. IEEE Signal Processing Magazine, 25(2), 72–82.
Article Google Scholar
Ma, S., Goldfarb, D., & Chen, L. (2011). Fixed point and Bregman iterative methods for matrix rank minimization. Mathematical Programming, 128(1–2), 321–353.
Article MATH MathSciNet Google Scholar
Meier, L., Van De Geer, S., & Bühlmann, P. (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 70(1), 53–71.
Article MATH MathSciNet Google Scholar
Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2013). A sparse-group lasso. Journal of Computational and Graphical Statistics, 22(2), 231–245.
Article MathSciNet Google Scholar
Smith, D. S., Gore, J. C., Yankeelov, T. E., & Welch, E. B. (2012). Real-time compressive sensing MRI reconstruction using GPU computing and split bregman methods. International Journal of Biomedical Imaging, 2012, 1–6.
Article Google Scholar
Stone, S. S., Haldar, J. P., Tsao, S. C., Wm, Hwu, Sutton, B. P., Liang, Z. P., et al. (2008). Accelerating advanced MRI reconstructions on GPUs. Journal of Parallel and Distributed Computing, 68(10), 1307–1318.
Article Google Scholar
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B (Methodological) (pp. 267–288).
Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612.
Article Google Scholar
Xu, J., Feng, X., & Hao, Y. (2014). A coupled variational model for image denoising using a duality trategy and split Bregman. Multidimensional Systems and Signal Processing, 25(1), 83–94.
Article MATH Google Scholar
Ye, G. B., & Xie, X. (2011). Split Bregman method for large scale fused lasso. Computational Statistics and Data Analysis, 55(4), 1552–1569.
Article MathSciNet Google Scholar
Yin, W., Osher, S., Goldfarb, D., & Darbon, J. (2008). Bregman iterative algorithms for $\ell _1$-minimization with applications to compressed sensing. SIAM Journal on Imaging Sciences, 1(1), 143–168.
Article MATH MathSciNet Google Scholar
Yuan, M., & Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(1), 49–67.
Article MATH MathSciNet Google Scholar
Zhao, P., & Yu, B. (2006). On model selection consistency of lasso. The Journal of Machine Learning Research, 7, 2541–2563.
MATH MathSciNet Google Scholar
Zhu, X., Huang, Z., Cui, J., & Shen, H. (2013). Video-to-shot tag propagation by graph sparse group lasso. IEEE Transactions on Multimedia, 15(3), 633–646.
Article Google Scholar
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American statistical association, 101(476), 1418–1429.
Article MATH MathSciNet Google Scholar
Zou, J., Fu, Y., & Xie, S. (2012). A block fixed point continuation algorithm for block-sparse reconstruction. IEEE Signal Processing Letters, 19(6), 364–367.
Article Google Scholar

Download references

Acknowledgments

The work is supported partially by Guangdong and National ministry of education ii projection (Grant No. 2012B091100331), International cooperation project of Guangdong Natural Science Foundation (Grant No. 2009B050700020), NOS—Guangdong Union Project (Grant No. U0835003), NOS (Grant Nos. 61104053, 61103122).

Author information

Authors and Affiliations

School of Information and Mathematics, Yangtze University, Jingzhou, 434020, Hubei, China
Jian Zou
School of Electronic and Information Engineering, South China University of Technology, Guangzhou, 510641, Guangdong, China
Yuli Fu

Authors

Jian Zou
View author publications
You can also search for this author in PubMed Google Scholar
Yuli Fu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Yuli Fu.

Appendix: Proof of Theorem 1

In this appendix, we present the proof of Theorem 1.

Proof

Since subproblems (9)–(11) are convex, the first order optimality condition of Algorithm 1 gives as follows:

$$\begin{aligned} \left\{ \begin{array}{ll} \mathbf {0} = \mathbf {A}^T (\mathbf {A}\mathbf {x}^{k+1}-\mathbf {y}) + \mu _1\mathbf {D}^T(\mathbf {Dx}^{k+1}-\mathbf {u}^k+\mathbf {b}^k)+ \mu _2\mathbf {D}^T(\mathbf {Dx}^{k+1}-\mathbf {v}^k+\mathbf {c}^k), \\ \mathbf {0} = \lambda _1 \mathbf {p}^{k+1}+ \mu _1(\mathbf {u}^{k+1}-\mathbf {Dx}^k-\mathbf {b}^k),\\ \mathbf {0} = \lambda _2 \mathbf {q}^{k+1}+ \mu _2(\mathbf {v}^{k+1}-\mathbf {Dx}^k-\mathbf {c}^k), \\ \mathbf {b}^{k+1} = \mathbf {b}^k + (\mathbf {Dx}^{k+1} - \mathbf {u}^{k+1}),\\ \mathbf {c}^{k+1} = \mathbf {c}^k + (\mathbf {Dx}^{k+1} - \mathbf {v}^{k+1}),\\ \end{array} \right. \end{aligned}$$

(24)

where $\mathbf {p}^{k+1} \in \partial \Vert \mathbf {u}^{k+1} \Vert _{2,1}$ and $\mathbf {q}^{k+1} \in \partial \Vert \mathbf {v}^{k+1} \Vert _{1}$. The definition of subgradients $\partial \Vert \cdot \Vert _{2,1}$ and $\partial \Vert \cdot \Vert _{1}$ can be seen in Simon et al. (2013).

Since there exists at least one solution $\mathbf {x}^*$ of (5) by the assumption, by the first order optimality condition, $\mathbf {x}^*$ must satisfy

$$\begin{aligned} \mathbf {A}^T(\mathbf {A}\mathbf {x}^{*}-\mathbf {y}) + \lambda _1 \mathbf {D}^T\mathbf {p}^* + \lambda _2 \mathbf {D}^T\mathbf {q}^* = 0, \end{aligned}$$

(25)

where $\mathbf {p}^{*} \in \partial \Vert \mathbf {u}^{*} \Vert _{2,1}, \mathbf {q}^{*} \in \partial \Vert \mathbf {v}^{*} \Vert _{1}$ and $\mathbf {u}^{*} = \mathbf {v}^{*} = \mathbf {Dx}^{*}$.

Let $\mathbf {b}^* = \lambda _1\mathbf {p}^*/ \mu _1$ and $\mathbf {c}^* = \lambda _2\mathbf {q}^*/ \mu _2$. Then (25) can be formulated as

$$\begin{aligned} \left\{ \begin{array}{ll} \mathbf {0} = \mathbf {A}^T (\mathbf {A}\mathbf {x}^{*}-\mathbf {y}) + \mu _1\mathbf {D}^T(\mathbf {Dx}^{*}-\mathbf {u}^*+\mathbf {b}^*) + \mu _2\mathbf {D}^T(\mathbf {Dx}^{*}-\mathbf {v}^*+\mathbf {c}^*), \\ \mathbf {0} = \lambda _1 \mathbf {p}^{*}+ \mu _1(\mathbf {u}^{*}-\mathbf {Dx}^*-\mathbf {b}^*), \\ \mathbf {0} = \lambda _2 \mathbf {q}^{*}+ \mu _2(\mathbf {v}^{*}-\mathbf {Dx}^*-\mathbf {c}^*), \\ \mathbf {b}^{*} = \mathbf {b}^* + (\mathbf {Dx}^{*} - \mathbf {u}^{*}),\\ \mathbf {c}^{*} = \mathbf {c}^* + (\mathbf {Dx}^{*} - \mathbf {v}^{*}).\\ \end{array} \right. \end{aligned}$$

(26)

Comparing (26) with (24), $(\mathbf {x}^{*}, \mathbf {u}^{*}, \mathbf {v}^{*}, \mathbf {b}^{*}, \mathbf {c}^{*})$ is a fixed point of Algorithm 1. Paper Cai et al. (2009) pointed out that if the unconstrained spilt Bregman iteration converges, it converges to a solution of (4).

Denote the errors by $\mathbf {x}_e^{k} = \mathbf {x}^{k}-\mathbf {x}^{*}, \mathbf {u}_e^{k} = \mathbf {u}^{k}-\mathbf {u}^{*}, \mathbf {v}_e^{k} = \mathbf {v}^{k}-\mathbf {v}^{*}, \mathbf {b}_e^{k} = \mathbf {b}^{k}-\mathbf {b}^{*}$ and $\mathbf {c}_e^{k} = \mathbf {c}^{k}-\mathbf {c}^{*}$.

Subtracting the first equation of (24) by the first equation of (26), we obtain

$$\begin{aligned} \mathbf {0} = \mathbf {A}^T \mathbf {A}\mathbf {x}_e^{k+1} + \mu _1\mathbf {D}^T(\mathbf {Dx}_e^{k+1}-\mathbf {u}_e^k+\mathbf {b}_e^k)+ \mu _2\mathbf {D}^T(\mathbf {Dx}_e^{k+1}-\mathbf {v}_e^k+\mathbf {c}_e^k). \end{aligned}$$

(27)

Taking the inner product of the left- and right-hand sides with $\mathbf {x}_e^{k+1}$, we have

$$\begin{aligned} \mathbf {0}&= \left\| \mathbf {A}\mathbf {X}_e^{k+1} \right\| _2^2 + \mu _1 \left\| \mathbf {Dx}_e^{k+1} \right\| _2^2 - \mu _1 \left\langle \mathbf {u}_e^k, \mathbf {Dx}_e^{k+1} \right\rangle \nonumber \\&+ \mu _1 \left\langle \mathbf {b}_e^k, \mathbf {Dx}_e^{k+1} \right\rangle + \mu _2 \left\| \mathbf {Dx}_e^{k+1} \right\| _2^2 - \mu _2 \left\langle \mathbf {v}_e^k, \mathbf {Dx}_e^{k+1} \right\rangle + \mu _2 \left\langle \mathbf {c}_e^k, \mathbf {Dx}_e^{k+1} \right\rangle . \end{aligned}$$

(28)

Subtracting the second equation of (24) by the second equation of (26) and then taking the inner product of the left- and right-hand sides with $\mathbf {u}_e^{k+1}$, we have

$$\begin{aligned} \mathbf {0} = \lambda _1 \left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}^{k+1}-\mathbf {u}^{*}\right\rangle + \mu _1 \left\| \mathbf {u}_e^{k+1} \right\| _2^2 - \mu _1 \left\langle \mathbf {u}_e^{k+1}, \mathbf {Dx}_e^{k+1} \right\rangle - \mu _1 \left\langle \mathbf {b}_e^k, \mathbf {u}_e^{k+1} \right\rangle \nonumber \\ \end{aligned}$$

(29)

where $\mathbf {p}^{k+1} \in \partial \Vert \mathbf {u}^{k+1} \Vert _{2,1}, \mathbf {p}^{*} \in \partial \Vert \mathbf {u}^{*} \Vert _{2,1}$.

Similarly, from the third equation of (24) and (26) we have

$$\begin{aligned} \mathbf {0} = \lambda _2 \left\langle \mathbf {q}^{k+1}-\mathbf {q}^{*}, \mathbf {v}^{k+1}-\mathbf {v}^{*}\right\rangle + \mu _2 \left\| \mathbf {v}_e^{k+1} \right\| _2^2 - \mu _2 \left\langle \mathbf {v}_e^{k+1}, \mathbf {Dx}_e^{k+1} \right\rangle - \mu _2 \left\langle \mathbf {c}_e^k, \mathbf {v}_e^{k+1} \right\rangle \nonumber \\ \end{aligned}$$

(30)

where $\mathbf {q}^{k+1} \in \partial \Vert \mathbf {v}^{k+1} \Vert _{1}, \mathbf {q}^{*} \in \partial \Vert \mathbf {v}^{*} \Vert _{1}$.

Adding both sides of (28)–(30), we have

$$\begin{aligned} \mathbf {0}&= \left\| \mathbf {A}\mathbf {x}_e^{k+1} \right\| _2^2 + \lambda _1\left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}^{k+1}-\mathbf {u}^{*}\right\rangle + \lambda _2\left\langle \mathbf {q}^{k+1}-\mathbf {q}^{*}, \mathbf {v}^{k+1}-\mathbf {v}^{*}\right\rangle \nonumber \\&+ \mu _1 \left( \left\| \mathbf {Dx}_e^{k+1} \right\| _2^2 + \left\| \mathbf {u}_e^{k+1} \right\| _2^2 - \left\langle \mathbf {Dx}_e^{k+1},\mathbf {u}_e^{k+1}+\mathbf {u}_e^{k} \right\rangle + \left\langle \mathbf {b}_e^k, \mathbf {Dx}_e^{k+1}-\mathbf {u}_e^{k+1} \right\rangle \right) \nonumber \\&+ \mu _2 \left( \left\| \mathbf {Dx}_e^{k+1} \right\| _2^2 + \left\| \mathbf {v}_e^{k+1} \right\| _2^2 - \left\langle \mathbf {Dx}_e^{k+1},\mathbf {v}_e^{k+1}+\mathbf {v}_e^{k} \right\rangle + \left\langle \mathbf {c}_e^k, \mathbf {Dx}_e^{k+1}-\mathbf {v}_e^{k+1} \right\rangle \right) .\nonumber \\ \end{aligned}$$

(31)

Furthermore, by subtracting the fourth equation of (24) by the corresponding one in (26), we obtain

$$\begin{aligned} \mathbf {b}_e^{k+1} = \mathbf {b}_e^k + \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k+1}. \end{aligned}$$

(32)

Taking square norm of both sides of (32) implies

$$\begin{aligned} \left\| \mathbf {b}_e^{k+1}\right\| _2^2 = \left\| \mathbf {b}_e^k\right\| _2^2 + \left\| \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k+1} \right\| _2^2 + 2\left\langle \mathbf {b}_e^k, \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k+1}\right\rangle , \end{aligned}$$

(33)

and further

$$\begin{aligned} \left\langle \mathbf {b}_e^k, \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k+1}\right\rangle = \frac{1}{2}\left( \left\| \mathbf {b}_e^{k+1}\right\| _2^2 - \left\| \mathbf {b}_e^k\right\| _2^2 - \left\| \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k+1}\right\| _2^2\right) . \end{aligned}$$

(34)

Similarly, from the fifth equation of (24) and (26) we have

$$\begin{aligned} \left\langle \mathbf {c}_e^k, \mathbf {Dx}_e^{k+1} - \mathbf {v}_e^{k+1}\right\rangle = \frac{1}{2}\left( \left\| \mathbf {c}_e^{k+1}\right\| _2^2 - \left\| \mathbf {c}_e^k\right\| _2^2 - \left\| \mathbf {Dx}_e^{k+1} - \mathbf {v}_e^{k+1}\right\| _2^2\right) . \end{aligned}$$

(35)

Substituting (34) and (35) into (31), we have

$$\begin{aligned} \begin{aligned}&\frac{\mu _1}{2}\left( \left\| \mathbf {b}_e^{k}\right\| _2^2 - \left\| \mathbf {b}_e^{k+1}\right\| _2^2\right) + \frac{\mu _2}{2}\left( \left\| \mathbf {c}_e^{k}\right\| _2^2 - \left\| \mathbf {c}_e^{k+1}\right\| _2^2\right) \\&\quad = \left\| \mathbf {A}\mathbf {x}_e^{k+1} \right\| _2^2 + \lambda _1 \left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}^{k+1}-\mathbf {u}^{*}\right\rangle + \lambda _2 \left\langle \mathbf {q}^{k+1}-\mathbf {q}^{*}, \mathbf {v}^{k+1}-\mathbf {v}^{*}\right\rangle \\&\qquad + \frac{\mu _1}{2}\left( \left\| \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k} \right\| _2^2 + \left\| \mathbf {u}_e^{k+1} \right\| _2^2 - \left\| \mathbf {u}_e^{k}\right\| _2^2\right) \\&\qquad + \frac{\mu _2}{2}\left( \left\| \mathbf {Dx}_e^{k+1} - \mathbf {v}_e^{k}\right\| _2^2 + \left\| \mathbf {v}_e^{k+1} \right\| _2^2 - \left\| \mathbf {v}_e^{k}\right\| _2^2\right) \\ \end{aligned} \end{aligned}$$

(36)

Summing (36) from $k=0$ to $k=K$ yields

$$\begin{aligned} \begin{aligned}&\frac{\mu _1}{2}\left( \left\| \mathbf {b}_e^{0}\right\| _2^2 - \left\| \mathbf {b}_e^{K+1}\right\| _2^2\right) + \frac{\mu _2}{2}\left( \left\| \mathbf {c}_e^{0}\right\| _2^2 - \left\| \mathbf {c}_e^{K+1}\right\| _2^2\right) \\&\quad = \sum _{k=0}^K \left\| \mathbf {A}\mathbf {x}_e^{k+1} \right\| _2^2 \\&\qquad + \lambda _1 \sum _{k=0}^K \left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}^{k+1}-\mathbf {u}^{*}\right\rangle + \lambda _2 \sum _{k=0}^K \left\langle \mathbf {q}^{k+1}-\mathbf {q}^{*}, \mathbf {v}^{k+1}-\mathbf {v}^{*}\right\rangle \\&\qquad + \frac{\mu _1}{2} \left( \sum _{k=0}^K \left\| \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k} \right\| _2^2 + \left\| \mathbf {u}_e^{K+1} \right\| _2^2 - \left\| \mathbf {u}_e^{0}\right\| _2^2\right) \\&\qquad + \frac{\mu _2}{2} \left( \sum _{k=0}^K \left\| \mathbf {Dx}_e^{k+1} - \mathbf {v}_e^{k}\right\| _2^2 + \left\| \mathbf {v}_e^{K+1} \right\| _2^2 - \left\| \mathbf {v}_e^{0}\right\| _2^2\right) . \\ \end{aligned} \end{aligned}$$

(37)

Since $\mathbf {p}^{k+1} \in \partial \Vert \mathbf {u}^{k+1} \Vert _{2,1}, \mathbf {p}^{*} \in \partial \Vert \mathbf {u}^{*} \Vert _{2,1}$ and $\Vert \cdot \Vert _{2,1}$ is convex, for $\forall k$,

$$\begin{aligned} \begin{aligned} \left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}^{k+1}-\mathbf {u}^{*}\right\rangle&= \Vert \mathbf {u}^{k+1} \Vert _2^2 - \Vert \mathbf {u}^{*}\Vert _2^2 - \left\langle \mathbf {p}^{*}, \mathbf {u}^{k+1}-\mathbf {u}^{*}\right\rangle \\&\quad + \Vert \mathbf {u}^{*} \Vert _2^2 - \Vert \mathbf {u}^{k+1}\Vert _2^2 - \left\langle \mathbf {p}^{k+1}, \mathbf {u}^{*}-\mathbf {u}^{k+1}\right\rangle \ge 0, \end{aligned} \end{aligned}$$

(38)

where the last inequation is by the definition of subgradient. Similarly, $\Big \langle \mathbf {q}^{k+1}-\mathbf {q}^{*}, \mathbf {v}^{k+1}-\mathbf {v}^{*}\Big \rangle \ge 0.$

Therefore, all terms involved in (37) are nonnegative. We have

$$\begin{aligned} \begin{aligned}&\frac{\mu _1}{2}\left( \left\| \mathbf {b}_e^{0}\right\| _2^2 + \left\| \mathbf {u}_e^{0}\right\| _2^2\right) + \frac{\mu _2}{2}\left( \left\| \mathbf {c}_e^{0}\right\| _2^2 + \left\| \mathbf {v}_e^{0}\right\| _2^2\right) \\&\quad \ge \sum _{k=0}^K \left\| \mathbf {A}\mathbf {x}_e^{k+1} \right\| _2^2\\&\qquad + \lambda _1 \sum _{k=0}^K \left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}^{k+1}-\mathbf {u}^{*}\right\rangle + \lambda _2 \sum _{k=0}^K \left\langle \mathbf {q}^{k+1}-\mathbf {q}^{*}, \mathbf {v}^{k+1}-\mathbf {v}^{*}\right\rangle \\&\qquad + \frac{\mu _1}{2} \sum _{k=0}^K \left\| \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k} \right\| _2^2 + \frac{\mu _2}{2} \sum _{k=0}^K \left\| \mathbf {Dx}_e^{k+1} - \mathbf {v}_e^{k} \right\| _2^2. \\ \end{aligned} \end{aligned}$$

(39)

From (39) and $\mu _1 >0, \mu _2 >0, \lambda _1 >0, \lambda _2 >0$, we can get

$$\begin{aligned} \sum _{k=0}^{+ \infty } \left\| \mathbf {A}\mathbf {x}_e^{k+1} \right\| _2^2 < + \infty . \end{aligned}$$

(40)

Equation (40) imply that $\lim _{k\rightarrow + \infty } \Vert \mathbf {A}\mathbf {x}_e^{k} \Vert _2^2 =0$, and

$$\begin{aligned} \begin{aligned} \lim _{k\rightarrow + \infty } \left\| \mathbf {A}\mathbf {x}_e^{k} \right\| _2^2 = \lim _{k\rightarrow + \infty } \Vert \mathbf {A}\mathbf {x}^{k} - \mathbf {A}\mathbf {x}^{*}\Vert _2^2 = \lim _{k\rightarrow + \infty } \left\langle \mathbf {A}^T(\mathbf {A}\mathbf {x}^{k}-\mathbf {A}\mathbf {x}^{*}), \mathbf {x}^{k} - \mathbf {x}^{*} \right\rangle = 0. \end{aligned} \end{aligned}$$

(41)

Recall the definition of the Bregman distance (Goldstein and Osher 2009; Cai et al. 2009), for any convex function $J$, we have

$$\begin{aligned} D_J^{\mathbf {p}}(\mathbf {x},\mathbf {z}) + D_J^{\mathbf {q}}(\mathbf {z},\mathbf {x}) = \left\langle \mathbf {q}-\mathbf {p}, \mathbf {x}-\mathbf {z}\right\rangle \end{aligned}$$

(42)

where $\mathbf {p} \in \partial J(\mathbf {z})$ and $\mathbf {q} \in \partial J(\mathbf {x})$.

Now let $J(\mathbf {x}) = \Vert \mathbf {Ax-y}\Vert _2^2, \nabla J(\mathbf {x}) = \mathbf {A}^T(\mathbf {Ax-y})$, from (42) we have

$$\begin{aligned} D_J^{\nabla J(\mathbf {x}^{*})}(\mathbf {x}^k,\mathbf {x}^*) + D_J^{\nabla J(\mathbf {x}^{k})}(\mathbf {x}^*,\mathbf {x}^k) = \left\langle \mathbf {A}^T(\mathbf {A}\mathbf {x}^{k}-\mathbf {A}\mathbf {x}^{*}), \mathbf {x}^{k} - \mathbf {x}^{*} \right\rangle = 0. \end{aligned}$$

(43)

Equation (43) together with the nonnegativity of the Bregman distance, implies that $\lim _{k\rightarrow + \infty } D_J^{\nabla J(\mathbf {x}^{*})}(\mathbf {x}^k,\mathbf {x}^*) =0$, i.e.,

$$\begin{aligned} \lim _{k\rightarrow + \infty } \frac{1}{2}\Vert \mathbf {A}\mathbf {x}^{k} - \mathbf {y}\Vert _2^2 - \frac{1}{2}\Vert \mathbf {A}\mathbf {x}^{*} - \mathbf {y}\Vert _2^2 - \left\langle \mathbf {A}^T(\mathbf {A}\mathbf {x}^{*}-\mathbf {y}), \mathbf {x}^{k} - \mathbf {x}^{*} \right\rangle = 0. \end{aligned}$$

(44)

Equation (39) also leads to

$$\begin{aligned} \sum _{k=0}^{+ \infty } \left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}_e^{k+1}-\mathbf {u}_e^{*}\right\rangle < + \infty , \end{aligned}$$

(45)

and hence

$$\begin{aligned} \lim _{k\rightarrow + \infty } \left\langle \mathbf {p}^{k+1}-\mathbf {p}^{*}, \mathbf {u}_e^{k+1}-\mathbf {u}_e^{*}\right\rangle =0. \end{aligned}$$

(46)

Let $J(\mathbf {u}) = \Vert \mathbf {u}\Vert _{2,1}$ and $\mathbf {p} \in \partial \Vert \mathbf {u} \Vert _{2,1}$, we can get $\lim _{k\rightarrow + \infty } D_{\Vert \cdot \Vert _{2,1}}^{\mathbf {p}^*}(\mathbf {u}^k,\mathbf {u}^*) =0$, i.e.,

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf {u}^k \Vert _{2,1} - \Vert \mathbf {u}^* \Vert _{2,1} - \left\langle \mathbf {u}_e^{k}-\mathbf {u}_e^{*}, \mathbf {p}^{*} \right\rangle = 0 \end{aligned}$$

(47)

Similarly, let $J(\mathbf {v}) = \Vert \mathbf {v}\Vert _{1}$ and $\mathbf {q} \in \partial \Vert \mathbf {v} \Vert _{1}$ we can get

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf {v}^k \Vert _{1} - \Vert \mathbf {v}^* \Vert _{1} - \left\langle \mathbf {v}_e^{k}-\mathbf {v}_e^{*}, \mathbf {q}^{*} \right\rangle = 0 \end{aligned}$$

(48)

Moreover, (39) also leads to $\sum _{k=0}^{+ \infty } \Vert \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k} \Vert _2^2 < + \infty $, which implies that $\lim _{k\rightarrow + \infty } \Vert \mathbf {Dx}_e^{k+1} - \mathbf {u}_e^{k} \Vert _2^2 = 0$. By $\mathbf {Dx}^{*} = \mathbf {u}^{*}$, we conclude that

$$\begin{aligned} \begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf {Dx}_e^{k+1} - \mathbf {x}_e^{k} \Vert _2^2&= \lim _{k\rightarrow + \infty } \Vert \mathbf {Dx}^{k+1} - \mathbf {Dx}^{*} - \mathbf {u}^{k} + \mathbf {u}^{*} \Vert _2^2 \\&= \lim _{k\rightarrow + \infty } \Vert \mathbf {Dx}^{k+1} - \mathbf {u}^{k} \Vert _2^2 = 0. \end{aligned} \end{aligned}$$

(49)

Since $\Vert \cdot \Vert _{2,1}$ is continuous, by (47) and (49), for $\mathbf {x}$, we have

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf {Dx}^k \Vert _{2,1} - \Vert \mathbf {Dx}^* \Vert _{2,1} - \left\langle \mathbf {Dx}^{k}-\mathbf {Dx}^{*}, \mathbf {p}^{*} \right\rangle = 0. \end{aligned}$$

(50)

Similarly with (49), we have $\lim _{k\rightarrow + \infty } \Vert \mathbf {Dx}^{k+1} - \mathbf {v}^{k} \Vert _2^2 = 0$ and since $\Vert \cdot \Vert _{1}$ is continuous, for $\mathbf {x}$, by (48), we also have

$$\begin{aligned} \lim _{k\rightarrow + \infty } \Vert \mathbf {Dx}^k \Vert _{1} - \Vert \mathbf {Dx}^* \Vert _{1} - \left\langle \mathbf {Dx}^{k}-\mathbf {Dx}^{*}, \mathbf {q}^{*} \right\rangle = 0. \end{aligned}$$

(51)

Summing (44), (50) and (51)yields

$$\begin{aligned} \begin{aligned} 0&= \lim _{k\rightarrow + \infty } \frac{1}{2} \left\| \mathbf {A}\mathbf {x}^k-\mathbf {y}\right\| _2^2 + \lambda _1 \left\| \mathbf {Dx}^k \right\| _{2,1} + \lambda _2\left\| \mathbf {Dx}^k \right\| _{1} \\&\quad - \frac{1}{2} \left\| \mathbf {A}\mathbf {x}^*-\mathbf {y}\right\| _2^2 - \lambda _1 \left\| \mathbf {Dx}^* \right\| _{2,1} - \lambda _2\left\| \mathbf {Dx}^* \right\| _{1}\\&\quad - \left\langle \mathbf {Dx}^{k}-\mathbf {Dx}^{*}, \lambda _1 \mathbf {p}^{*} + \lambda _2 \mathbf {q}^{*} + \mathbf {A}^T(\mathbf {A}\mathbf {x}^{*}-\mathbf {y}) \right\rangle \\&= \lim _{k\rightarrow + \infty } \frac{1}{2} \left\| \mathbf {A}\mathbf {x}^k-\mathbf {y}\right\| _2^2 + \lambda _1 \left\| \mathbf {Dx}^k \right\| _{2,1} + \lambda _2\left\| \mathbf {Dx}^k \right\| _{1}\\&\quad - \frac{1}{2} \left\| \mathbf {A}\mathbf {x}^*-\mathbf {y}\right\| _2^2 - \lambda _1 \left\| \mathbf {Dx}^* \right\| _{2,1} - \lambda _2\left\| \mathbf {Dx}^* \right\| _{1}.\\ \end{aligned} \end{aligned}$$

(52)

where the last equality comes from (25). So (18) is proved.

Next, we prove (19) whenever (3) has a unique solution. It is proved by contradiction.

Define $\varPhi (\mathbf {x}) = \frac{1}{2} \Vert \mathbf {Ax-y}\Vert _2^2 + \lambda _1 \Vert \mathbf {Dx} \Vert _{2,1} + \lambda _2 \Vert \mathbf {Dx} \Vert _{1}$. Then $\varPhi (\mathbf {x})$ is convex and lower semi-continuous. Since $\mathbf {x}^*$ is the unique minimizer, we have $\varPhi (\mathbf {x}) > \varPhi (\mathbf {x}^*)$ for $\mathbf {x} \ne \mathbf {x}^*$.

Now, suppose that (19) does not hold. So, there exists a subsequence $\mathbf {x}^{k_i}$ such that $\Vert \mathbf {x}^{k_i} - \mathbf {x}^{*}\Vert _2 > \epsilon $ for some $\epsilon > 0$ and all $i$. Then, $\varPhi (\mathbf {x}^{k_i}) > \min \{ \varPhi (\mathbf {x}) : \Vert \mathbf {x}^{k_i} - \mathbf {x}^{*}\Vert _2 = \epsilon \}$. Let $\mathbf {z}$ be the intersection of $\{ \mathbf {z} : \Vert \mathbf {z} - \mathbf {z}^{*}\Vert _2 = \epsilon \}$ and the line from $\mathbf {x}^{*}$ to $\mathbf {x}^{k_i}$. Indeed, there exists a positive number $t \in (0,1)$ such that $\mathbf {z} = t \mathbf {x}^{*} + (1-t) \mathbf {x}^{k_i}$. By the convexity of $\varPhi $ and the definition of $\mathbf {x}^{*}$, we have

$$\begin{aligned} \begin{aligned} \varPhi (\mathbf {x}^{k_i})&\ge t \varPhi (\mathbf {x}^{*}) + (1-t) \varPhi (\mathbf {x}^{k_i}) \ge \varPhi (t \mathbf {x}^{*} + (1-t) \mathbf {x}^{k_i}) = \varPhi (\mathbf {z}) \\&\ge \min \{ \varPhi (\mathbf {x}) : \Vert \mathbf {x}^{k_i} - \mathbf {x}^{*}\Vert _2 = \epsilon \}.\\ \end{aligned} \end{aligned}$$

(53)

Denote ${\tilde{\mathbf{x}}} = \min \{ \varPhi (\mathbf {x}) : \Vert \mathbf {x}^{k_i} - \mathbf {x}^{*}\Vert _2 = \epsilon \}$. By (20), we have

$$\begin{aligned} \varPhi (\mathbf {x}^*) = \lim _{k\rightarrow + \infty } \varPhi (\mathbf {x}^{k_i}) \ge \varPhi ({\tilde{\mathbf{x}}}) > \varPhi (\mathbf {x}^*), \end{aligned}$$

(54)

which is a contradiction. So (19) holds whenever (5) has a unique solution. $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Zou, J., Fu, Y. Split Bregman algorithms for sparse group Lasso with application to MRI reconstruction. Multidim Syst Sign Process 26, 787–802 (2015). https://doi.org/10.1007/s11045-014-0282-7

Download citation

Received: 07 October 2013
Revised: 04 January 2014
Accepted: 28 January 2014
Published: 12 February 2014
Issue Date: July 2015
DOI: https://doi.org/10.1007/s11045-014-0282-7

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Split Bregman algorithms for sparse group Lasso with application to MRI reconstruction

Abstract

Access this article

Similar content being viewed by others

Group sparse recovery via group square-root elastic net and the iterative multivariate thresholding-based algorithm

$$\ell _{1/2,1}$$ group sparse regularization for compressive sensing

Solving constrained nonsmooth group sparse optimization via group Capped- $$\ell _1$$ relaxation and group smoothing proximal gradient algorithm

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix: Proof of Theorem 1

Proof

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Split Bregman algorithms for sparse group Lasso with application to MRI reconstruction

Abstract

Access this article

Similar content being viewed by others

Group sparse recovery via group square-root elastic net and the iterative multivariate thresholding-based algorithm

$$\ell _{1/2,1}$$ group sparse regularization for compressive sensing

Solving constrained nonsmooth group sparse optimization via group Capped- $$\ell _1$$ relaxation and group smoothing proximal gradient algorithm

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Appendix: Proof of Theorem 1

Appendix: Proof of Theorem 1

Proof

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation