Abstract
This paper is concerned with least squares loss constrained low-rank plus sparsity optimization problems, which seek a low-rank matrix and a sparse matrix by minimizing a positive combination of the rank function and the zero norm over a least squares constraint set that encodes the observations or prior information on the target matrix pair. For this class of NP-hard optimization problems, we propose a two-stage convex relaxation approach via the majorization of suitable locally Lipschitz continuous surrogates, which offers a remarkable advantage in reducing the error incurred by the popular nuclear norm plus \(\ell _1\)-norm convex relaxation method. Under a suitable restricted eigenvalue condition, we establish a Frobenius norm error bound for the optimal solution of each stage and show that the error bound of the first-stage convex relaxation (i.e., the nuclear norm plus \(\ell _1\)-norm convex relaxation) can be substantially reduced by the second-stage convex relaxation, thereby providing a theoretical guarantee for the two-stage approach. We also verify the efficiency of the proposed approach by applying it to random test problems and to problems with real data arising from specularity removal in face images and foreground/background separation in surveillance videos.
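Schematically, the problem studied here can be written as follows (a sketch for orientation only; the sampling operator \({\mathcal {A}}\), the observation \(b\), the noise level \(\delta \) and the weights \(\mu ,\lambda >0\) are generic placeholders rather than the paper's exact notation):
\[
\min _{L,S\in {\mathbb {R}}^{n_1\times n_2}}\ \mu \,\mathrm{rank}(L)+\lambda \Vert S\Vert _0\quad \text{s.t.}\quad \Vert {\mathcal {A}}(L+S)-b\Vert \le \delta ,
\]
whose first-stage (nuclear norm plus \(\ell _1\)-norm) convex relaxation replaces \(\mathrm{rank}(L)\) by \(\Vert L\Vert _*\) and \(\Vert S\Vert _0\) by \(\Vert S\Vert _1\).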
Acknowledgments
The authors would like to thank the two anonymous referees for their helpful suggestions on the revision of the original manuscript. This work was supported by the National Natural Science Foundation of China under project Nos. 11501219 and 11571120, the Natural Science Foundation of Guangdong Province under project Nos. 2015A030313214 and 2015A030310298, and the Fundamental Research Funds for the Central Universities (SCUT).
Appendix
Lemma 6.1
Let \({\mathcal {T}}\) be the subspace given by (3). Then, for any \(Z\in {\mathbb {R}}^{n_1\times n_2}\), it holds that
Proof
Let \({\mathcal {B}}:{\mathbb {R}}^{n_1\times n_2}\rightarrow {\mathbb {R}}^{n_1\times n_2}\) be defined by \({\mathcal {B}}(X)= U_1^*(U_1^*)^{{\mathbb {T}}}XV_2^* (V_2^*)^{{\mathbb {T}}}+XV_1^*(V_1^*)^{{\mathbb {T}}}\) for \(X\in {\mathbb {R}}^{n_1\times n_2}\). Then, from the definition of the subspace \({\mathcal {T}}\), it follows that
By the necessary optimality conditions of (20), there exists \(Y\in {\mathbb {R}}^{n_1\times n_2}\) such that
By the expression of \({\mathcal {B}}\), it is easy to check that \({\mathcal {B}}\) is self-adjoint and \({\mathcal {B}}({\mathcal {B}}(X))={\mathcal {B}}(X)\) for any \(X\in {\mathbb {R}}^{n_1\times n_2}\). Together with the last equation, we deduce that
which implies the expression of \({\mathcal {P}}_{{\mathcal {T}}}(Z)\). The same arguments yield the expression of \({\mathcal {P}}_{{\mathcal {T}}^{\perp }}(Z)\). This completes the proof. \(\square \)
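For reference, the expressions established in this lemma read as follows (a reconstruction from the definition of \({\mathcal {B}}\) above; here \(U_2^*\) denotes an orthonormal basis of the complement of the column space of \(U_1^*\), an assumption consistent with the notation \(V_2^*\)):
\[
{\mathcal {P}}_{{\mathcal {T}}}(Z)=U_1^*(U_1^*)^{{\mathbb {T}}}ZV_2^*(V_2^*)^{{\mathbb {T}}}+ZV_1^*(V_1^*)^{{\mathbb {T}}},\qquad
{\mathcal {P}}_{{\mathcal {T}}^{\perp }}(Z)=U_2^*(U_2^*)^{{\mathbb {T}}}ZV_2^*(V_2^*)^{{\mathbb {T}}}.
\]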
The proof of Lemma 2.2
It suffices to consider the case where the right-hand side is positive (otherwise the inequality is trivial), which implies that \(\langle H,{\mathcal {A}}^*{\mathcal {A}}(G)\rangle >0\) and hence \(\langle H,{\mathcal {A}}^*{\mathcal {A}}(H)\rangle >0\). Without loss of generality, we assume that \(n_1\le n_2\) and that \(\Omega \) takes the form
where \(\lfloor \frac{s^*}{n_1}\rfloor \) denotes the largest integer not exceeding \(\frac{s^*}{n_1}\), and all components \(G_{ij}\) with \(i+(j-1)n_1>s^*\) are arranged in descending order by the column index, i.e.,
where \(i_0\) and \(j_0\) are positive integers such that \(i_0+n_1(j_0-1)=s^*\). For \(k=1,2,\ldots \), let
except that the largest column index in the last block stops at \(n_2\). Comparing with (21), we see immediately that \(\Omega =\Omega _0\) and \(\Lambda =\Omega _1\), and consequently \(\Gamma =\Omega _0\cup \Omega _1\). In addition, from the ordering of \(|G_{ij}|\) for \(i+(j-1)n_1>s^*\), we have \(\Vert {\mathcal {P}}_{\Omega _k}(G)\Vert _{\infty }\le \frac{\Vert {\mathcal {P}}_{\Omega _{k-1}}(G)\Vert _{1}}{t}\) for \(k>1\), which implies that \(\sum _{k>1}\Vert {\mathcal {P}}_{\Omega _k}(G)\Vert _{\infty }\le \frac{\Vert {\mathcal {P}}_{\Omega ^{c}}(G)\Vert _{1}}{t}\). Then, we have that
where the second equality uses \(\Gamma =\Omega _0\cup \Omega _1\), the third is due to \(\langle H,{\mathcal {A}}^*{\mathcal {A}}(H)\rangle >0\), and the first inequality follows from the definition of \(\varpi (\cdot ,\cdot )\). Combining the last inequality with
we immediately obtain the desired result. This completes the proof. \(\square \)
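The block-decomposition step above is the standard "shifting" argument from sparse recovery; the following minimal numerical sketch (Python; all names are illustrative rather than taken from the paper) checks the bound \(\sum _{k>1}\Vert {\mathcal {P}}_{\Omega _k}(G)\Vert _{\infty }\le \Vert {\mathcal {P}}_{\Omega ^{c}}(G)\Vert _{1}/t\) on random data:

    import numpy as np

    # Entries of G outside the support are sorted by decreasing magnitude and
    # grouped into consecutive blocks Omega_1, Omega_2, ... of size t. Since
    # the largest entry of block k (k > 1) is at most the average entry of
    # block k-1, summing gives the bound checked below.
    rng = np.random.default_rng(2016)
    g = np.abs(rng.standard_normal(500))   # magnitudes |G_ij| on Omega^c
    t = 20                                 # block size, as in the proof

    order = np.sort(g)[::-1]               # descending order, as in the proof
    nblk = int(np.ceil(order.size / t))
    blocks = [order[k * t:(k + 1) * t] for k in range(nblk)]

    lhs = sum(b.max() for b in blocks[1:]) # sum of sup norms over k > 1
    rhs = order.sum() / t                  # ||P_{Omega^c}(G)||_1 / t
    assert lhs <= rhs
    print(f"{lhs:.4f} <= {rhs:.4f}")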
The proof of Theorem 4.1
Let the subspaces \({\mathcal {H}}\) and \({\mathcal {K}}\) and the index sets \(\Gamma \) and \(\Lambda \) be defined as in Lemma 4.2. Using Lemma 2.4 with \({\mathcal {J}}_1={\mathcal {H}},\ {\mathcal {I}}={\mathcal {K}},\ G=\Delta L^{k}\) and \(H={\mathcal {P}}_{{\mathcal {K}}}(\Delta L^k)\) yields that
Using Lemma 2.2 with \(H={\mathcal {P}}_{\Gamma }(\Delta S^k)\) and \(G=\Delta S^k\) then yields that
In addition, from the definitions of \(\vartheta _{+}(\cdot )\) and \(\chi _{+}(\cdot )\), it follows that
From the above four inequalities, we immediately obtain that
Let \(\beta ^k\equiv \max \Big (\frac{\pi (2r^*+l,l)}{l(1-\Vert {\mathcal {P}}_{{\mathcal {T}}^\bot }(W^{k-1})\Vert )}, \frac{\varpi (s^*+t,t)}{t\nu _{k-1}T_\mathrm{min}^{k-1}}\Big )\). From Lemma 4.1 and the definition of \(\widehat{\gamma }\),
where the last inequality is due to \(\beta ^k\le \frac{\widehat{\gamma }}{\min [(1-\Vert {\mathcal {P}}_{{\mathcal {T}}^{\perp }}(W^{k-1})\Vert ),\,\nu _{k-1}T_\mathrm{min}^{k-1}]}\) and \(\Gamma =\Omega \cup \Lambda \). Together with equation (22), we obtain that
Notice that Assumption 4.1, together with \(0\le \xi _{k-1}<\frac{1}{c_1}\) and \(0\le \eta _{k-1}<\frac{1}{c_2}\), implies that \(1-\widehat{\gamma }\sqrt{2r^*}\xi _{k-1}>0\) and \(1-\widehat{\gamma }\sqrt{s^*}\eta _{k-1}>0\). By the definitions of \(a_{k-1}\) and \(b_{k-1}\), we have
Next we focus on characterizing the bounds of \(\Vert {\mathcal {A}}(\Delta L^k)\Vert _F\) and \(\Vert {\mathcal {A}}(\Delta S^k)\Vert _F\). Since
it suffices to bound \(|\langle \mathcal{Q}(\Delta L^k),\Delta S^k\rangle |\). Since \(\Vert \Delta L^k\Vert _\infty \le 2\tau \), we have that
In addition, from Lemma 4.1 and the definitions of \(\zeta _{k-1}\) and \(\mu _{k-1}\), it follows that
From the last three inequalities, it follows that
Combining inequality (23) with inequality (24), we obtain that
Now substituting the bounds of \(\Vert {\mathcal {P}}_{{\mathcal {K}}}(\Delta L^k)\Vert _F\) and \(\Vert {\mathcal {P}}_{\Gamma }(\Delta S^k)\Vert _F\) into (12) gives that
Notice that \(x^2+y^2\le ax+by +c\) for \(a,b,c\in {\mathbb {R}}_{+}\) implies \(x+y\le \frac{a+b}{2}+\sqrt{2}\sqrt{c+\frac{a^2+b^2}{4}}\). Therefore, the last inequality implies that \(\Vert \Delta L^k\Vert _F+\Vert \Delta S^k\Vert _F\le \frac{a+b}{2}+\sqrt{2}\sqrt{c+\frac{a^2+b^2}{4}}\) with
A suitable rearrangement then yields the desired result. This completes the proof. \(\square \)
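For completeness, the elementary implication invoked in the final step can be verified by completing squares: from \(x^2+y^2\le ax+by+c\) one obtains
\[
\Big(x-\tfrac{a}{2}\Big)^2+\Big(y-\tfrac{b}{2}\Big)^2\le c+\tfrac{a^2+b^2}{4},
\]
and hence, since \(u+v\le \sqrt{2}\sqrt{u^2+v^2}\) for all \(u,v\in {\mathbb {R}}\),
\[
x+y\le \tfrac{a+b}{2}+\sqrt{2}\sqrt{\Big(x-\tfrac{a}{2}\Big)^2+\Big(y-\tfrac{b}{2}\Big)^2}\le \tfrac{a+b}{2}+\sqrt{2}\sqrt{c+\tfrac{a^2+b^2}{4}}.
\]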
Cite this article
Han, L., Bi, S. & Pan, S. Two-stage convex relaxation approach to least squares loss constrained low-rank plus sparsity optimization problems. Comput Optim Appl 64, 119–148 (2016). https://doi.org/10.1007/s10589-015-9797-6