Abstract
Schatten p-quasi-norm minimization has advantages over nuclear norm minimization in recovering low-rank matrices. However, Schatten p-quasi-norm minimization is much more difficult, especially for generic linear matrix equations. In this paper, we first extend the lower bound theory of \(\ell _p\) minimization to Schatten p-quasi-norm minimization. We prove that the positive singular values of local minimizers are bounded from below by a constant. Motivated by this property, we propose a proximal linearization method, whose subproblems can be solved efficiently by the (linearized) alternating direction method of multipliers. The convergence analysis of the proposed method involves the nonsmooth analysis of singular value functions. We give a necessary and sufficient condition for a singular value function to be a Kurdyka–Łojasiewicz function. The subdifferentials of related singular value functions are computed. The global convergence of the proposed method is established under some assumptions. Experiments on matrix completion, Sylvester equations, and image deblurring show the effectiveness of the algorithm.
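For concreteness, the Schatten p-quasi-norm of a matrix is \((\sum _i \sigma _i^p)^{1/p}\) for \(0<p\le 1\), reducing to the nuclear norm at \(p=1\). A minimal NumPy sketch of this quantity (the function name is illustrative, not from the paper's code):

```python
import numpy as np

def schatten_p_quasi_norm(X, p):
    """(sum_i sigma_i(X)^p)^(1/p): the nuclear norm for p = 1,
    a quasi-norm for 0 < p < 1."""
    s = np.linalg.svd(X, compute_uv=False)
    return np.sum(s ** p) ** (1.0 / p)

# Smaller p penalizes small singular values more aggressively,
# which promotes low-rank solutions.
A = np.outer([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])  # rank-1 example
val_half = schatten_p_quasi_norm(A, 0.5)
val_nuclear = schatten_p_quasi_norm(A, 1.0)
```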
Notes
The code is available at https://zhouchenlin.github.io/.
Given a matrix X, we set the numerical rank as the number of singular values \(\sigma _r(X)\) satisfying \(\sigma _r(X)/\Vert X\Vert _F\ge 10^{-4}\).
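The numerical-rank definition in this note can be sketched directly in NumPy (the function name is illustrative):

```python
import numpy as np

def numerical_rank(X, tol=1e-4):
    """Number of singular values with sigma_r(X) / ||X||_F >= tol."""
    fro = np.linalg.norm(X, "fro")
    if fro == 0:
        return 0
    s = np.linalg.svd(X, compute_uv=False)
    return int(np.sum(s / fro >= tol))
```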
References
Attouch, H., Bolte, J.: On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. 116(1), 5–16 (2009)
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010)
Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137(1–2), 91–129 (2013)
Beck, A., Teboulle, M.: Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 18(11), 2419–2434 (2009)
Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)
Boţ, R.I., Nguyen, D.-K.: The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates. Math. Oper. Res. 45(2), 682–712 (2020)
Cai, J.-F., Candès, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)
Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717–772 (2009)
Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489–509 (2006)
Candès, E.J., Tao, T.: The power of convex relaxation: near-optimal matrix completion. IEEE Trans. Inf. Theory 56(5), 2053–2080 (2010)
Candès, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted \(\ell _1\) minimization. J. Fourier Anal. Appl. 14(5), 877–905 (2008)
Chan, R.H., Tao, M., Yuan, X.: Constrained total variation deblurring models and fast algorithms based on alternating direction method of multipliers. SIAM J. Imag. Sci. 6(1), 680–697 (2013)
Chen, C., He, B., Yuan, X.: Matrix completion via an alternating direction method. IMA J. Numer. Anal. 32(1), 227–245 (2012)
Chen, X., Ng, M.K., Zhang, C.: Non-Lipschitz-regularization and box constrained model for image restoration. IEEE Trans. Image Process. 21(12), 4709–4721 (2012)
Chen, X., Xu, F., Ye, Y.: Lower bound theory of nonzero entries in solutions of \(\ell _2\)-\(\ell _p\) minimization. SIAM J. Sci. Comput. 32(5), 2832–2852 (2011)
Donoho, D.L.: For most large underdetermined systems of linear equations the minimal \(\ell _1\)-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59(6), 797–829 (2006)
Eckart, C., Young, G.: The approximation of one matrix by another of lower rank. Psychometrika 1(3), 211–218 (1936)
El Ghaoui, L., Gahinet, P.: Rank minimization under LMI constraints: a framework for output feedback problems. In: European Control Conf., pp. 1176–1179 (1993)
Fazel, M., Hindi, H., Boyd, S.P.: A rank minimization heuristic with application to minimum order system approximation. In: Proceedings of the 2001 American Control Conference (Cat. No. 01CH37148), vol. 6, pp. 4734–4739. IEEE (2001)
Fornasier, M., Rauhut, H., Ward, R.: Low-rank matrix recovery via iteratively reweighted least squares minimization. SIAM J. Optim. 21(4), 1614–1640 (2011)
Gazzola, S., Meng, C., Nagy, J.G.: Krylov methods for low-rank regularization. SIAM J. Matrix Anal. Appl. 41(4), 1477–1504 (2020)
Gu, S., Xie, Q., Meng, D., Zuo, W., Feng, X., Zhang, L.: Weighted nuclear norm minimization and its applications to low level vision. Int. J. Comput. Vis. 121(2), 183–208 (2017)
Gu, S., Zhang, L., Zuo, W., Feng, X.: Weighted nuclear norm minimization with application to image denoising. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2862–2869 (2014)
Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (2012)
Hosseini, S., Luke, D.R., Uschmajew, A.: Tangent and normal cones for low-rank matrices. In: Nonsmooth Optimization and its Applications, pp. 45–53 (2019)
Lai, M.-J., Liu, Y., Li, S., Wang, H.: On the Schatten \(p\)-quasi-norm minimization for low-rank matrix recovery. Appl. Comput. Harmon. Anal. 51, 157–170 (2021)
Lai, M.-J., Xu, Y., Yin, W.: Improved iteratively reweighted least squares for unconstrained smoothed \(\ell _q\) minimization. SIAM J. Numer. Anal. 51(2), 927–957 (2013)
Lai, M.-J., Yin, W.: Augmented \(\ell _1\) and nuclear-norm models with a globally linearly convergent algorithm. SIAM J. Imag. Sci. 6(2), 1059–1091 (2013)
Larsen, R.M.: PROPACK: software for large and sparse SVD calculations. http://sun.stanford.edu/~rmunk/PROPACK/
Lee, K., Elman, H.C.: A preconditioned low-rank projection method with a rank-reduction scheme for stochastic partial differential equations. SIAM J. Sci. Comput. 39(5), S828–S850 (2017)
Lewis, A.S., Sendov, H.S.: Nonsmooth analysis of singular values. Part I: theory. Set-Valued Anal. 13(3), 213–241 (2005)
Li, G., Pong, T.K.: Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 25(4), 2434–2460 (2015)
Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. 18(5), 1199–1232 (2018)
Lin, Z.: Some software packages for partial SVD computation. arXiv preprint arXiv:1108.1548 (2011)
Liu, Z., Wu, C., Zhao, Y.: A new globally convergent algorithm for non-Lipschitz \(\ell _p-\ell _q\) minimization. Adv. Comput. Math. 45(3), 1369–1399 (2019)
Lu, C., Lin, Z., Yan, S.: Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization. IEEE Trans. Image Process. 24(2), 646–654 (2014)
Markovsky, I.: Structured low-rank approximation and its applications. Automatica 44(4), 891–909 (2008)
Mohan, K., Fazel, M.: Iterative reweighted algorithms for matrix rank minimization. J. Mach. Learn. Res. 13(1), 3441–3473 (2012)
Nikolova, M.: Analysis of the recovery of edges in images and signals by minimizing nonconvex regularized least-squares. Multiscale Model. Simul. 4(3), 960–991 (2005)
Pong, T.K., Tseng, P., Ji, S., Ye, J.: Trace norm regularization: reformulations, algorithms, and multi-task learning. SIAM J. Optim. 20(6), 3465–3489 (2010)
Recht, B., Fazel, M., Parrilo, P.A.: Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis, vol. 317. Springer (2009)
Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Physica D 60(1–4), 259–268 (1992)
Simoncini, V.: Computational methods for linear matrix equations. SIAM Rev. 58(3), 377–441 (2016)
Van den Dries, L., Miller, C.: Geometric categories and o-minimal structures. Duke Math. J. 84(2), 497–540 (1996)
Vandereycken, B.: Low-rank matrix completion by Riemannian optimization. SIAM J. Optim. 23(2), 1214–1236 (2013)
Wang, Y., Yang, J., Yin, W., Zhang, Y.: A new alternating minimization algorithm for total variation image reconstruction. SIAM J. Imag. Sci. 1(3), 248–272 (2008)
Wang, Y., Yin, W., Zeng, J.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78(1), 29–63 (2019)
Wen, Z., Yin, W., Zhang, Y.: Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Math. Program. Comput. 4(4), 333–361 (2012)
Yang, J., Yuan, X.: Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization. Math. Comput. 82(281), 301–329 (2013)
Yu, P., Li, G., Pong, T.K.: Kurdyka–Łojasiewicz exponent via INF-projection. Found. Comput. Math. 22, 1–47 (2021)
Zeng, C., Wu, C.: On the edge recovery property of nonconvex nonsmooth regularization in image restoration. SIAM J. Numer. Anal. 56(2), 1168–1182 (2018)
Zeng, C., Wu, C.: On the discontinuity of images recovered by nonconvex nonsmooth regularized isotropic models with box constraints. Adv. Comput. Math. 45(2), 589–610 (2019)
Zeng, C., Wu, C., Jia, R.: Non-Lipschitz models for image restoration with impulse noise removal. SIAM J. Imag. Sci. 12(1), 420–458 (2019)
Zhang, X., Bai, M., Ng, M.K.: Nonconvex-TV based image restoration with impulse noise removal. SIAM J. Imag. Sci. 10(3), 1627–1667 (2017)
Zheng, Z., Ng, M., Wu, C.: A globally convergent algorithm for a class of gradient compounded non-Lipschitz models applied to non-additive noise removal. Inverse Prob. 36(12), 125017 (2020)
Acknowledgements
The author is extremely grateful to the editor and the two anonymous referees for their valuable feedback, which improved this paper significantly. The author is also grateful to Dr. Xianshun Nian and Dr. Guomin Liu for helpful discussions. This work was partially supported by the National Natural Science Foundation of China (12201319).
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
This appendix summarizes some important results on KL theory and gives some examples. The following definition is adopted from Attouch et al. [2, Definition 4.1].
Definition 7.1
(o-minimal structure on \(\mathbb {R}\)) Let \(\mathscr {O} = \{\mathscr {O}_n\}_{n \in \mathbb {N}}\) such that each \(\mathscr {O}_n\) is a collection of subsets of \(\mathbb {R}^n\). The family \(\mathscr {O}\) is an o-minimal structure on \(\mathbb {R}\), if it satisfies the following axioms:
(i) Each \(\mathscr {O}_n\) is a boolean algebra. Namely, \(\emptyset \in \mathscr {O}_n\) and, for each \(\mathscr {A},\mathscr {B}\) in \(\mathscr {O}_n\), the sets \(\mathscr {A} \cup \mathscr {B}\), \(\mathscr {A} \cap \mathscr {B}\), and \(\mathbb {R}^n \setminus \mathscr {A}\) belong to \(\mathscr {O}_n\).
(ii) For all \(\mathscr {A}\) in \(\mathscr {O}_n\), \(\mathscr {A} \times \mathbb {R}\) and \(\mathbb {R} \times \mathscr {A}\) belong to \(\mathscr {O}_{n+1}\).
(iii) For all \(\mathscr {A}\) in \(\mathscr {O}_{n+1}\), \(\{(x_1,\ldots ,x_n)\in \mathbb {R}^n: (x_1,\ldots ,x_n,x_{n+1}) \in \mathscr {A}\}\) belongs to \(\mathscr {O}_{n}\).
(iv) For all \(i \ne j\) in \(\{1,2,\ldots ,n\}\), \(\{(x_1,\ldots ,x_n) \in \mathbb {R}^n: x_i = x_j\}\) belongs to \(\mathscr {O}_n\).
(v) The set \(\{(x_1,x_2) \in \mathbb {R}^2: x_1<x_2\}\) belongs to \(\mathscr {O}_2\).
(vi) The elements of \(\mathscr {O}_1\) are exactly finite unions of intervals.
Let \(\mathscr {O}\) be an o-minimal structure on \(\mathbb {R}\). We say that a set \(\mathscr {A} \subseteq \mathbb {R}^n\) is definable (on \(\mathscr {O}\)) if \(\mathscr {A} \in \mathscr {O}_n\). A function \(f: \mathbb {R}^n \rightarrow (-\infty , +\infty ]\) is definable if its graph \(\{(\varvec{x},y) \in \mathbb {R}^n \times (-\infty ,+\infty ]: y = f(\varvec{x})\}\) is definable on \(\mathscr {O}\). We list some known elementary properties of definable functions below.
Property 7.2
(See [2]) Finite sums of definable functions are definable; indicator functions of definable sets are definable; compositions of definable functions or mappings are definable.
It is known that any proper lower semicontinuous function that is definable is a KL function; see [2, Theorem 4.1].
Example 7.3
An important example of an o-minimal structure is the log-exp structure [45, Example 2.5]. In this structure, the following functions are all definable:
1. semi-algebraic functions; see Definition 7.4 below.
2. the function \(\mathbb {R} \rightarrow \mathbb {R}\) given by
$$\begin{aligned} x \mapsto {\left\{ \begin{array}{ll} x^r, &{} x >0, \\ 0, &{} x \le 0, \end{array}\right. } \end{aligned}$$
where \(r \in \mathbb {R}\).
3. the exponential function \(\mathbb {R} \rightarrow \mathbb {R}\) given by \(x \mapsto e^x\) and the logarithm function \((0,\infty ) \rightarrow \mathbb {R}\) given by \(x \mapsto \log (x)\).
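The truncated power function in item 2 can be evaluated through the log-exp structure itself, writing \(x^r = e^{r\log x}\) on the positive axis. A small illustrative Python sketch (function name is an assumption, not from the paper):

```python
import math

def truncated_power(x, r):
    """x^r for x > 0 and 0 for x <= 0, computed as exp(r * log(x))
    on the positive axis, mirroring the log-exp structure."""
    if x > 0:
        return math.exp(r * math.log(x))
    return 0.0
```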
Definition 7.4
(See [5, Definition 5]) A subset \(\mathscr {S}\) of \(\mathbb {R}^d\) is a real semi-algebraic set if there exists a finite number of real polynomial functions \(f_{ij},g_{ij}: \mathbb {R}^d \rightarrow \mathbb {R}\) such that
$$\begin{aligned} \mathscr {S} = \bigcup _{j=1}^{p} \bigcap _{i=1}^{q} \left\{ \varvec{x} \in \mathbb {R}^d: f_{ij}(\varvec{x}) = 0,\ g_{ij}(\varvec{x}) < 0 \right\} . \end{aligned}$$
A function \(f: \mathbb {R}^{d} \rightarrow (-\infty ,+\infty ]\) is called semi-algebraic if its graph
$$\begin{aligned} \left\{ (\varvec{x},t) \in \mathbb {R}^{d+1}: f(\varvec{x}) = t \right\} \end{aligned}$$
is a semi-algebraic subset of \(\mathbb {R}^{d+1}\).
The class of semi-algebraic sets is stable under the following operations: finite unions, finite intersections, complementation and Cartesian products.
Example 7.5
[5, Example 2] There is a broad class of semi-algebraic sets and functions arising in optimization.
1. Real polynomial functions.
2. Indicator functions of semi-algebraic sets.
3. Finite sums and products of semi-algebraic functions.
4. Compositions of semi-algebraic functions.
5. In matrix theory, all of the following are semi-algebraic sets: the cone of positive semidefinite matrices, Stiefel manifolds, and sets of matrices of constant rank.
Example 7.6
Define \(f: \mathbb {R}^{n}\rightarrow \mathbb {R}\) by \(f(\varvec{x})=\sum _{i=1}^{n}|x_i|^p\). We prove that f is definable. First, consider the function \(g(t)=|t|\). The graph of g is
$$\begin{aligned} \left\{ (t,s) \in \mathbb {R}^2: s = t,\ t \ge 0 \right\} \cup \left\{ (t,s) \in \mathbb {R}^2: s = -t,\ t < 0 \right\} , \end{aligned}$$
which is a semi-algebraic set. Hence, g is a semi-algebraic function. From Property 7.2 and Example 7.3, we know that \(f(\varvec{x})\) is definable.
Example 7.7
We prove that the function \(f(\varvec{x})\) defined in (47) is a semi-algebraic function. From Example 7.5, we see that it suffices to prove that the functions \(f_i(\varvec{x}):=b_i \underline{x}_i\), \(i=1,\ldots ,r\), are semi-algebraic. We only prove that \(f_1(\varvec{x})=b_1 \underline{x}_1\) is semi-algebraic; the other cases are similar. Define
By the definition of \(\underline{\varvec{x}}\), we have
and \(\{\varvec{x}\in \mathbb {R}^h: |x_j|\ge |x_k|\}\) can be written as a union of some semi-algebraic sets. Hence, \(\mathscr {T}_j\) is semi-algebraic. The graph of \(f_1(\varvec{x})\) is
which is a semi-algebraic set.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zeng, C. Proximal linearization methods for Schatten p-quasi-norm minimization. Numer. Math. 153, 213–248 (2023). https://doi.org/10.1007/s00211-022-01335-7