
Proximal linearization methods for Schatten p-quasi-norm minimization


Abstract

Schatten p-quasi-norm minimization has advantages over nuclear norm minimization in recovering low-rank matrices. However, Schatten p-quasi-norm minimization is much more difficult to solve, especially for generic linear matrix equations. In this paper, we first extend the lower bound theory of \(\ell _p\) minimization to Schatten p-quasi-norm minimization: we prove that the positive singular values of local minimizers are bounded from below by a constant. Motivated by this property, we propose a proximal linearization method whose subproblems can be solved efficiently by the (linearized) alternating direction method of multipliers. The convergence analysis of the proposed method involves the nonsmooth analysis of singular value functions. We give a necessary and sufficient condition for a singular value function to be a Kurdyka–Łojasiewicz function, and we compute the subdifferentials of related singular value functions. The global convergence of the proposed method is established under some assumptions. Experiments on matrix completion, the Sylvester equation and image deblurring show the effectiveness of the algorithm.
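For context, we recall the Schatten p-quasi-norm and the generic recovery model that this line of work addresses; this is the standard formulation, and the constrained or regularized variants actually treated in the paper may differ in details. For \(X\in \mathbb {R}^{m\times n}\) with singular values \(\sigma _1(X)\ge \cdots \ge \sigma _{\min \{m,n\}}(X)\ge 0\) and \(0<p<1\),

$$\begin{aligned} \Vert X\Vert _{S_p}:=\Big (\sum _{i=1}^{\min \{m,n\}}\sigma _i(X)^p\Big )^{1/p}, \qquad \min _{X\in \mathbb {R}^{m\times n}} \Vert X\Vert _{S_p}^p \quad \text {s.t.}\quad \mathscr {A}(X)=\varvec{b}, \end{aligned}$$

where \(\mathscr {A}\) is a linear map encoding, for example, matrix completion or a Sylvester equation. Note that \(\Vert X\Vert _{S_p}^p=\sum _i \sigma _i(X)^p\) tends to \({\text {rank}}(X)\) as \(p\rightarrow 0^+\) and equals the nuclear norm at \(p=1\).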


Notes

  1. The code is available at https://zhouchenlin.github.io/.

  2. Given a matrix X, we set the numerical rank as the number of singular values \(\sigma _r(X)\) satisfying \(\sigma _r(X)/\Vert X\Vert _F\ge 10^{-4}\).
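As a quick illustration of this convention, here is a minimal sketch in Python/NumPy (our own code, not code from the paper; the function name is ours):

```python
import numpy as np

def numerical_rank(X: np.ndarray, tol: float = 1e-4) -> int:
    """Numerical rank as in Note 2: the number of singular values
    sigma_r(X) with sigma_r(X) / ||X||_F >= tol."""
    s = np.linalg.svd(X, compute_uv=False)  # singular values, descending
    fro = np.linalg.norm(X, "fro")          # ||X||_F
    if fro == 0.0:
        return 0                            # the zero matrix has rank 0
    return int(np.sum(s / fro >= tol))

# A rank-2 matrix perturbed by tiny noise is still reported as rank 2.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
print(numerical_rank(A + 1e-8 * rng.standard_normal(A.shape)))  # -> 2
```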

References

  1. Attouch, H., Bolte, J.: On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program. 116(1), 5–16 (2009)

  2. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka–Łojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010)

  3. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137(1–2), 91–129 (2013)

  4. Beck, A., Teboulle, M.: Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans. Image Process. 18(11), 2419–2434 (2009)

  5. Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Math. Program. 146(1–2), 459–494 (2014)

  6. Boţ, R.I., Nguyen, D.-K.: The proximal alternating direction method of multipliers in the nonconvex setting: convergence analysis and rates. Math. Oper. Res. 45(2), 682–712 (2020)

  7. Cai, J.-F., Candès, E.J., Shen, Z.: A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20(4), 1956–1982 (2010)

  8. Candès, E.J., Recht, B.: Exact matrix completion via convex optimization. Found. Comput. Math. 9(6), 717–772 (2009)

  9. Candès, E.J., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52(2), 489–509 (2006)

  10. Candès, E.J., Tao, T.: The power of convex relaxation: near-optimal matrix completion. IEEE Trans. Inf. Theory 56(5), 2053–2080 (2010)

  11. Candès, E.J., Wakin, M.B., Boyd, S.P.: Enhancing sparsity by reweighted \(\ell _1\) minimization. J. Fourier Anal. Appl. 14(5), 877–905 (2008)

  12. Chan, R.H., Tao, M., Yuan, X.: Constrained total variation deblurring models and fast algorithms based on alternating direction method of multipliers. SIAM J. Imag. Sci. 6(1), 680–697 (2013)

  13. Chen, C., He, B., Yuan, X.: Matrix completion via an alternating direction method. IMA J. Numer. Anal. 32(1), 227–245 (2012)

  14. Chen, X., Ng, M.K., Zhang, C.: Non-Lipschitz \(\ell _p\)-regularization and box constrained model for image restoration. IEEE Trans. Image Process. 21(12), 4709–4721 (2012)

  15. Chen, X., Xu, F., Ye, Y.: Lower bound theory of nonzero entries in solutions of \(\ell _2\)-\(\ell _p\) minimization. SIAM J. Sci. Comput. 32(5), 2832–2852 (2010)

  16. Donoho, D.L.: For most large underdetermined systems of linear equations the minimal \(\ell _1\)-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59(6), 797–829 (2006)

  17. Eckart, C., Young, G.: The approximation of one matrix by another of lower rank. Psychometrika 1(3), 211–218 (1936)

  18. El Ghaoui, L., Gahinet, P.: Rank minimization under LMI constraints: a framework for output feedback problems. In: Proceedings of the European Control Conference, pp. 1176–1179 (1993)

  19. Fazel, M., Hindi, H., Boyd, S.P.: A rank minimization heuristic with application to minimum order system approximation. In: Proceedings of the 2001 American Control Conference, vol. 6, pp. 4734–4739. IEEE (2001)

  20. Fornasier, M., Rauhut, H., Ward, R.: Low-rank matrix recovery via iteratively reweighted least squares minimization. SIAM J. Optim. 21(4), 1614–1640 (2011)

  21. Gazzola, S., Meng, C., Nagy, J.G.: Krylov methods for low-rank regularization. SIAM J. Matrix Anal. Appl. 41(4), 1477–1504 (2020)

  22. Gu, S., Xie, Q., Meng, D., Zuo, W., Feng, X., Zhang, L.: Weighted nuclear norm minimization and its applications to low level vision. Int. J. Comput. Vis. 121(2), 183–208 (2017)

  23. Gu, S., Zhang, L., Zuo, W., Feng, X.: Weighted nuclear norm minimization with application to image denoising. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2862–2869 (2014)

  24. Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (2012)

  25. Hosseini, S., Luke, D.R., Uschmajew, A.: Tangent and normal cones for low-rank matrices. In: Nonsmooth Optimization and its Applications, pp. 45–53 (2019)

  26. Lai, M.-J., Liu, Y., Li, S., Wang, H.: On the Schatten \(p\)-quasi-norm minimization for low-rank matrix recovery. Appl. Comput. Harmon. Anal. 51, 157–170 (2021)

  27. Lai, M.-J., Xu, Y., Yin, W.: Improved iteratively reweighted least squares for unconstrained smoothed \(\ell _q\) minimization. SIAM J. Numer. Anal. 51(2), 927–957 (2013)

  28. Lai, M.-J., Yin, W.: Augmented \(\ell _1\) and nuclear-norm models with a globally linearly convergent algorithm. SIAM J. Imag. Sci. 6(2), 1059–1091 (2013)

  29. Larsen, R.M.: PROPACK: software for large and sparse SVD calculations. http://sun.stanford.edu/~rmunk/PROPACK/

  30. Lee, K., Elman, H.C.: A preconditioned low-rank projection method with a rank-reduction scheme for stochastic partial differential equations. SIAM J. Sci. Comput. 39(5), S828–S850 (2017)

  31. Lewis, A.S., Sendov, H.S.: Nonsmooth analysis of singular values. Part I: theory. Set-Valued Anal. 13(3), 213–241 (2005)

  32. Li, G., Pong, T.K.: Global convergence of splitting methods for nonconvex composite optimization. SIAM J. Optim. 25(4), 2434–2460 (2015)

  33. Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. 18(5), 1199–1232 (2018)

  34. Lin, Z.: Some software packages for partial SVD computation. arXiv preprint arXiv:1108.1548 (2011)

  35. Liu, Z., Wu, C., Zhao, Y.: A new globally convergent algorithm for non-Lipschitz \(\ell _p\)-\(\ell _q\) minimization. Adv. Comput. Math. 45(3), 1369–1399 (2019)

  36. Lu, C., Lin, Z., Yan, S.: Smoothed low rank and sparse matrix recovery by iteratively reweighted least squares minimization. IEEE Trans. Image Process. 24(2), 646–654 (2014)

  37. Markovsky, I.: Structured low-rank approximation and its applications. Automatica 44(4), 891–909 (2008)

  38. Mohan, K., Fazel, M.: Iterative reweighted algorithms for matrix rank minimization. J. Mach. Learn. Res. 13(1), 3441–3473 (2012)

  39. Nikolova, M.: Analysis of the recovery of edges in images and signals by minimizing nonconvex regularized least-squares. Multiscale Model. Simul. 4(3), 960–991 (2005)

  40. Pong, T.K., Tseng, P., Ji, S., Ye, J.: Trace norm regularization: reformulations, algorithms, and multi-task learning. SIAM J. Optim. 20(6), 3465–3489 (2010)

  41. Recht, B., Fazel, M., Parrilo, P.A.: Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)

  42. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis, vol. 317. Springer (2009)

  43. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Physica D 60(1–4), 259–268 (1992)

  44. Simoncini, V.: Computational methods for linear matrix equations. SIAM Rev. 58(3), 377–441 (2016)

  45. van den Dries, L., Miller, C.: Geometric categories and o-minimal structures. Duke Math. J. 84(2), 497–540 (1996)

  46. Vandereycken, B.: Low-rank matrix completion by Riemannian optimization. SIAM J. Optim. 23(2), 1214–1236 (2013)

  47. Wang, Y., Yang, J., Yin, W., Zhang, Y.: A new alternating minimization algorithm for total variation image reconstruction. SIAM J. Imag. Sci. 1(3), 248–272 (2008)

  48. Wang, Y., Yin, W., Zeng, J.: Global convergence of ADMM in nonconvex nonsmooth optimization. J. Sci. Comput. 78(1), 29–63 (2019)

  49. Wen, Z., Yin, W., Zhang, Y.: Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm. Math. Program. Comput. 4(4), 333–361 (2012)

  50. Yang, J., Yuan, X.: Linearized augmented Lagrangian and alternating direction methods for nuclear norm minimization. Math. Comput. 82(281), 301–329 (2013)

  51. Yu, P., Li, G., Pong, T.K.: Kurdyka–Łojasiewicz exponent via inf-projection. Found. Comput. Math. 22, 1–47 (2021)

  52. Zeng, C., Wu, C.: On the edge recovery property of nonconvex nonsmooth regularization in image restoration. SIAM J. Numer. Anal. 56(2), 1168–1182 (2018)

  53. Zeng, C., Wu, C.: On the discontinuity of images recovered by nonconvex nonsmooth regularized isotropic models with box constraints. Adv. Comput. Math. 45(2), 589–610 (2019)

  54. Zeng, C., Wu, C., Jia, R.: Non-Lipschitz models for image restoration with impulse noise removal. SIAM J. Imag. Sci. 12(1), 420–458 (2019)

  55. Zhang, X., Bai, M., Ng, M.K.: Nonconvex-TV based image restoration with impulse noise removal. SIAM J. Imag. Sci. 10(3), 1627–1667 (2017)

  56. Zheng, Z., Ng, M., Wu, C.: A globally convergent algorithm for a class of gradient compounded non-Lipschitz models applied to non-additive noise removal. Inverse Probl. 36(12), 125017 (2020)


Acknowledgements

The author is extremely grateful to the editor and the two anonymous referees for their valuable feedback, which improved this paper significantly. The author is also grateful to Dr. Xianshun Nian and Dr. Guomin Liu for helpful discussions. This work was partially supported by the National Natural Science Foundation of China (12201319).

Author information

Corresponding author

Correspondence to Chao Zeng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

This appendix summarizes some important results on Kurdyka–Łojasiewicz (KL) theory and gives some examples. The following definition is taken from Attouch et al. [2, Definition 4.1].

Definition 7.1

(o-minimal structure on \(\mathbb {R}\)) Let \(\mathscr {O} = \{\mathscr {O}_n\}_{n \in \mathbb {N}}\) be such that each \(\mathscr {O}_n\) is a collection of subsets of \(\mathbb {R}^n\). The family \(\mathscr {O}\) is an o-minimal structure on \(\mathbb {R}\) if it satisfies the following axioms:

  (i) Each \(\mathscr {O}_n\) is a boolean algebra, namely, \(\emptyset \in \mathscr {O}_n\), and for each \(\mathscr {A},\mathscr {B}\) in \(\mathscr {O}_n\), the sets \(\mathscr {A} \cup \mathscr {B}\), \(\mathscr {A} \cap \mathscr {B}\) and \(\mathbb {R}^n \setminus \mathscr {A}\) belong to \(\mathscr {O}_n\).

  (ii) For all \(\mathscr {A}\) in \(\mathscr {O}_n\), \(\mathscr {A} \times \mathbb {R}\) and \(\mathbb {R} \times \mathscr {A}\) belong to \(\mathscr {O}_{n+1}\).

  (iii) For all \(\mathscr {A}\) in \(\mathscr {O}_{n+1}\), the projection \(\{(x_1,\ldots ,x_n)\in \mathbb {R}^n: (x_1,\ldots ,x_n,x_{n+1}) \in \mathscr {A} \text { for some } x_{n+1}\in \mathbb {R}\}\) belongs to \(\mathscr {O}_{n}\).

  (iv) For all \(i \ne j\) in \(\{1,2,\ldots ,n\}\), the set \(\{(x_1,\ldots ,x_n) \in \mathbb {R}^n: x_i = x_j\}\) belongs to \(\mathscr {O}_n\).

  (v) The set \(\{(x_1,x_2) \in \mathbb {R}^2: x_1<x_2\}\) belongs to \(\mathscr {O}_2\).

  (vi) The elements of \(\mathscr {O}_1\) are exactly the finite unions of intervals.

Let \(\mathscr {O}\) be an o-minimal structure on \(\mathbb {R}\). We say that a set \(\mathscr {A} \subseteq \mathbb {R}^n\) is definable (on \(\mathscr {O}\)) if \(\mathscr {A} \in \mathscr {O}_n\). A function \(f: \mathbb {R}^n \rightarrow (-\infty , +\infty ]\) is definable if its graph \(\{(\varvec{x},y) \in \mathbb {R}^n \times (-\infty ,+\infty ]: y = f(\varvec{x})\}\) is definable on \(\mathscr {O}\). We list some known elementary properties of definable functions below.

Property 7.2

(See [2]) Finite sums of definable functions are definable; indicator functions of definable sets are definable; compositions of definable functions or mappings are definable.

It is known that any proper lower semicontinuous function that is definable is a KL function; see [2, Theorem 4.1].
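For the reader's convenience, we recall what the KL property asserts in its standard form (see [2] and the references therein): a proper lower semicontinuous function \(f\) has the KL property at \(\bar{\varvec{x}}\in {\text {dom}}\,\partial f\) if there exist \(\eta \in (0,+\infty ]\), a neighborhood \(\mathscr {U}\) of \(\bar{\varvec{x}}\) and a concave function \(\varphi :[0,\eta )\rightarrow [0,+\infty )\), continuous at 0 and continuously differentiable on \((0,\eta )\), with \(\varphi (0)=0\) and \(\varphi '>0\), such that

$$\begin{aligned} \varphi '\big (f(\varvec{x})-f(\bar{\varvec{x}})\big )\,{\text {dist}}\big (\varvec{0},\partial f(\varvec{x})\big )\ge 1 \quad \text {for all } \varvec{x}\in \mathscr {U} \text { with } f(\bar{\varvec{x}})<f(\varvec{x})<f(\bar{\varvec{x}})+\eta . \end{aligned}$$

A KL function is a function that has the KL property at every point of \({\text {dom}}\,\partial f\).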

Example 7.3

An example of an o-minimal structure is the log-exp structure [45, Example 2.5]. In this structure, the following functions are all definable:

  1. semi-algebraic functions (see Definition 7.4 below);

  2. the function \(\mathbb {R} \rightarrow \mathbb {R}\) given by

     $$\begin{aligned} x \mapsto {\left\{ \begin{array}{ll} x^r, &{} x >0, \\ 0, &{} x \le 0, \end{array}\right. } \end{aligned}$$

     where \(r \in \mathbb {R}\);

  3. the exponential function \(\mathbb {R} \rightarrow \mathbb {R}\) given by \(x \mapsto e^x\) and the logarithm function \((0,\infty ) \rightarrow \mathbb {R}\) given by \(x \mapsto \log (x)\).

Definition 7.4

(See [5, Definition 5]) A subset \(\mathscr {S}\) of \(\mathbb {R}^d\) is a real semi-algebraic set if there exist finitely many real polynomial functions \(f_{ij},g_{ij}: \mathbb {R}^d \rightarrow \mathbb {R}\) such that

$$\begin{aligned} \mathscr {S}=\bigcup _{j=1}^s\bigcap _{i=1}^t \left\{ \varvec{x}\in \mathbb {R}^{d}: f_{ij}(\varvec{x})=0 \text { and } g_{ij}(\varvec{x})<0 \right\} . \end{aligned}$$

A function \(f: \mathbb {R}^{d} \rightarrow (-\infty ,+\infty ]\) is called semi-algebraic if its graph

$$\begin{aligned} \left\{ (\varvec{x},y)\in \mathbb {R}^{d+1}: f(\varvec{x})=y \right\} \end{aligned}$$

is a semi-algebraic subset of \(\mathbb {R}^{d+1}\).

The class of semi-algebraic sets is stable under the following operations: finite unions, finite intersections, complementation and Cartesian products.
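As a simple illustration of Definition 7.4 (our example, not one from the paper), the closed unit ball of \(\mathbb {R}^d\) has the required form with \(s=2\) and \(t=1\):

$$\begin{aligned} \{\varvec{x}\in \mathbb {R}^{d}: \Vert \varvec{x}\Vert _2\le 1\} =\{\varvec{x}: \Vert \varvec{x}\Vert _2^2-1<0\}\cup \{\varvec{x}: \Vert \varvec{x}\Vert _2^2-1=0\}, \end{aligned}$$

where the first branch takes \(f_{11}\equiv 0\), \(g_{11}(\varvec{x})=\Vert \varvec{x}\Vert _2^2-1\) and the second takes \(f_{12}(\varvec{x})=\Vert \varvec{x}\Vert _2^2-1\), \(g_{12}\equiv -1\). Note that \(\Vert \varvec{x}\Vert _2^2=\sum _i x_i^2\) is a polynomial.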

Example 7.5

[5, Example 2] There is a broad class of semi-algebraic sets and functions arising in optimization, including the following:

  1. Real polynomial functions.

  2. Indicator functions of semi-algebraic sets.

  3. Finite sums and products of semi-algebraic functions.

  4. Compositions of semi-algebraic functions.

  5. In matrix theory, all of the following are semi-algebraic sets: the cone of positive semidefinite matrices, Stiefel manifolds and matrices of constant rank (the Stiefel case is spelled out below).
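For instance, the Stiefel manifold is the solution set of polynomial equations in the matrix entries, hence semi-algebraic:

$$\begin{aligned} {\text {St}}(n,k)=\{X\in \mathbb {R}^{n\times k}: X^{\top }X=I_k\}, \end{aligned}$$

since each entry of \(X^{\top }X-I_k\) is a polynomial in the entries of \(X\) (take \(g_{ij}\equiv -1\) in Definition 7.4).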

Example 7.6

Define \(f: \mathbb {R}^{n}\rightarrow \mathbb {R}\) by \(f(\varvec{x})=\sum _{i=1}^{n}|x_i|^p\). We prove that f is definable. First, consider the function \(g(t)=|t|\). The graph of g is

$$\begin{aligned} \{(t,y)\in \mathbb {R}^2: y = t,\ t>0 \}\cup \{(t,y)\in \mathbb {R}^2: y = -t,\ t<0 \} \cup \{(t,y)\in \mathbb {R}^2: y = t,\ t=0 \}. \end{aligned}$$

Hence, g is a semi-algebraic function. From Property 7.2 and Example 7.3, we know that \(f(\varvec{x})\) is definable.
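In more detail, let \(\varphi \) denote the function of Example 7.3(2) with \(r=p\). Then each summand of f is the composition

$$\begin{aligned} |x_i|^p=\varphi (g(x_i)), \qquad \varphi (t)={\left\{ \begin{array}{ll} t^p, &{} t>0, \\ 0, &{} t\le 0, \end{array}\right. } \end{aligned}$$

of definable functions, and f is a finite sum of such compositions; Property 7.2 then yields the definability of f.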

Example 7.7

We prove that the function \(f(\varvec{x})\) defined in (47) is a semi-algebraic function. From Example 7.5, we see that it suffices to prove that the functions \(f_i(\varvec{x}):=b_i \underline{x}_i\), \(i=1,\ldots ,r\), are semi-algebraic. We only prove that \(f_1(\varvec{x})=b_1 \underline{x}_1\) is semi-algebraic; the other cases are similar. Define

$$\begin{aligned} \mathscr {T}_j:=\{\varvec{x}\in \mathbb {R}^h: |x_j|=\underline{x}_1\}, \quad j=1,\ldots ,h. \end{aligned}$$

By the definition of \(\underline{\varvec{x}}\), we have

$$\begin{aligned} \mathscr {T}_j=\bigcap _{k=1}^{h}\{\varvec{x}\in \mathbb {R}^h: |x_j|\ge |x_k|\}, \end{aligned}$$

and each set \(\{\varvec{x}\in \mathbb {R}^h: |x_j|\ge |x_k|\}=\{\varvec{x}\in \mathbb {R}^h: x_j^2-x_k^2\ge 0\}\) is semi-algebraic. Hence, \(\mathscr {T}_j\) is semi-algebraic. The graph of \(f_1(\varvec{x})\) is

$$\begin{aligned} \bigcup _{j=1}^h \left( \left\{ (\varvec{x},y)\in \mathbb {R}^{h+1}:y = b_1 |x_j| \right\} \cap (\mathscr {T}_j\times \mathbb {R}) \right) , \end{aligned}$$

which is a semi-algebraic set.
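As a concrete instance of this construction (our illustration), take \(h=2\). Then \(\mathscr {T}_1=\{\varvec{x}: |x_1|\ge |x_2|\}\), \(\mathscr {T}_2=\{\varvec{x}: |x_2|\ge |x_1|\}\), and the graph of \(f_1\) is

$$\begin{aligned} \big (\{(\varvec{x},y): y=b_1|x_1|\}\cap (\mathscr {T}_1\times \mathbb {R})\big )\cup \big (\{(\varvec{x},y): y=b_1|x_2|\}\cap (\mathscr {T}_2\times \mathbb {R})\big ). \end{aligned}$$

On the overlap \(|x_1|=|x_2|\) the two branches agree, so this union is exactly the graph of \(f_1(\varvec{x})=b_1\max \{|x_1|,|x_2|\}\).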

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zeng, C. Proximal linearization methods for Schatten p-quasi-norm minimization. Numer. Math. 153, 213–248 (2023). https://doi.org/10.1007/s00211-022-01335-7

