Abstract
As a convex relaxation of the rank minimization model, the nuclear norm minimization (NNM) problem has been attracting significant research interest in recent years. The standard NNM regularizes each singular value equally, yielding an easily calculated convex norm. However, this restricts its capability and flexibility in dealing with many practical problems, where the singular values have clear physical meanings and should be treated differently. In this paper we study the weighted nuclear norm minimization (WNNM) problem, which adaptively assigns weights to different singular values. As the key step in solving general WNNM models, the theoretical properties of the weighted nuclear norm proximal (WNNP) operator are investigated. Albeit nonconvex, we prove that WNNP is equivalent to a standard quadratic programming problem with linear constraints, which facilitates solving the original problem with off-the-shelf convex optimization solvers. In particular, when the weights are sorted in a non-descending order, its optimal solution can be easily obtained in closed form. With WNNP, the solving strategies for multiple extensions of WNNM, including robust PCA and matrix completion, can be readily constructed under the alternating direction method of multipliers paradigm. Furthermore, inspired by the reweighted sparse coding scheme, we present an automatic weight setting method, which greatly facilitates the practical implementation of WNNM. The proposed WNNM methods achieve state-of-the-art performance in typical low-level vision tasks, including image denoising, background subtraction and image inpainting.
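To make the closed-form case concrete, here is a minimal NumPy sketch of the weighted singular value shrinkage that solves WNNP when the weights are non-descending. It assumes the data-fidelity term \(\Vert Y-X\Vert _F^2\) without a factor of \(\frac{1}{2}\) (so each singular value is shrunk by half of its weight; with a \(\frac{1}{2}\) factor the shrinkage amount becomes the weight itself); the function name and interface are illustrative only.

```python
import numpy as np

def wnnp(Y, w):
    """Weighted nuclear norm proximal operator (sketch).

    Solves  min_X ||Y - X||_F^2 + sum_i w_i * sigma_i(X)  in closed form
    when w_1 <= w_2 <= ... <= w_n (non-descending weights): each singular
    value of Y is shrunk and clipped at zero.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)  # s sorted non-ascending
    # Assumes the fidelity term carries no 1/2 factor; with a 1/2 factor,
    # shrink by w_i instead of w_i / 2.
    d = np.maximum(s - np.asarray(w) / 2.0, 0.0)      # weighted shrinkage
    return (U * d) @ Vt                               # U diag(d) V^T

# Toy usage: heavier penalties on the smaller (noise-dominated) singular values.
Y = np.random.randn(8, 5)
w = np.linspace(0.1, 2.0, 5)   # non-descending weights
X_hat = wnnp(Y, w)
```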











Notes
A general proximal operator is defined on a convex problem to guarantee an accurate projection. Although the problem here is nonconvex, we can strictly prove that it is equivalent to a convex quadratic programming problem in Sect. 3. We thus also call it a proximal operator throughout the paper for convenience.
The SAR image was downloaded at http://aess.cs.unh.edu/radar%20se%20Lecture%2018%20B.html.
The color image was used in previous work (Portilla 2004).
The color versions of images #3, #5, #6, #7, #9, #11 are used in this MC experiment.
References
Arias, P., Facciolo, G., Caselles, V., & Sapiro, G. (2011). A variational framework for exemplar-based image inpainting. International Journal of Computer Vision, 93(3), 319–347.
Babacan, S. D., Luessi, M., Molina, R., & Katsaggelos, A. K. (2012). Sparse Bayesian methods for low-rank matrix estimation. IEEE Transactions on Signal Processing, 60(8), 3964–3977.
Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.
Buades, A., Coll, B., & Morel, J. M. (2005). A non-local algorithm for image denoising. In CVPR.
Buades, A., Coll, B., & Morel, J. M. (2008). Nonlocal image and movie denoising. International Journal of Computer Vision, 76(2), 123–139.
Buchanan, A. M., & Fitzgibbon, A. W. (2005). Damped Newton algorithms for matrix factorization with missing data. In CVPR.
Cai, J. F., Candès, E. J., & Shen, Z. (2010). A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4), 1956–1982.
Candès, E. J., & Recht, B. (2009). Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6), 717–772.
Candès, E. J., Wakin, M. B., & Boyd, S. P. (2008). Enhancing sparsity by reweighted \(l_1\) minimization. Journal of Fourier Analysis and Applications, 14(5–6), 877–905.
Candès, E. J., Li, X., Ma, Y., & Wright, J. (2011). Robust principal component analysis? Journal of the ACM, 58(3), 11.
Chan, T. F., & Shen, J. J. (2005). Image processing and analysis: Variational, PDE, wavelet, and stochastic methods. Philadelphia: SIAM Press.
Chartrand, R. (2007). Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Processing Letters, 14(10), 707–710.
Chartrand, R. (2012). Nonconvex splitting for regularized low-rank + sparse decomposition. IEEE Transactions on Signal Processing, 60(11), 5810–5819.
Dabov, K., Foi, A., Katkovnik, V., & Egiazarian, K. (2007). Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8), 2080–2095.
Dahl, J., Hansen, P. C., Jensen, S. H., & Jensen, T. L. (2010). Algorithms and software for total variation image reconstruction via first-order methods. Numerical Algorithms, 53(1), 67–92.
De La Torre, F., & Black, M. J. (2003). A framework for robust subspace learning. International Journal of Computer Vision, 54(1–3), 117–142.
Ding, X., He, L., & Carin, L. (2011). Bayesian robust principal component analysis. IEEE Transactions on Image Processing, 20(12), 3419–3430.
Dong, W., Zhang, L., & Shi, G. (2011). Centralized sparse representation for image restoration. In ICCV.
Dong, W., Shi, G., & Li, X. (2013). Nonlocal image restoration with bilateral variance estimation: A low-rank approach. IEEE Transactions on Image Processing, 22(2), 700–711.
Dong, W., Shi, G., Li, X., Ma, Y., & Huang, F. (2014). Compressive sensing via nonlocal low-rank regularization. IEEE Transactions on Image Processing, 23(8), 3618–3632.
Donoho, D. L. (1995). De-noising by soft-thresholding. IEEE Transactions on Information Theory, 41(3), 613–627.
Eriksson, A., & Van Den Hengel, A. (2010). Efficient computation of robust low-rank matrix approximations in the presence of missing data using the \(l_1\) norm. In CVPR.
Fazel, M. (2002). Matrix rank minimization with applications. PhD thesis, Stanford University.
Fazel, M., Hindi, H., & Boyd, S.P. (2001). A rank minimization heuristic with application to minimum order system approximation. In American Control Conference. (ACC).
Gu, S., Zhang, L., Zuo, W., & Feng, X. (2014). Weighted nuclear norm minimization with application to image denoising. In CVPR.
Jain, P., Netrapalli, P., & Sanghavi, S. (2013). Low-rank matrix completion using alternating minimization. In ACM symposium on theory of computing.
Ji, H., Liu, C., Shen, Z., & Xu, Y. (2010). Robust video denoising using low rank matrix completion. In CVPR.
Ji, S., & Ye, J. (2009). An accelerated gradient method for trace norm minimization. In ICML (pp. 457–464).
Ke, Q., & Kanade, T. (2005). Robust \(l_1\) norm factorization in the presence of outliers and missing data by alternative convex programming. In CVPR.
Kwak, N. (2008). Principal component analysis based on \(l_1\)-norm maximization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(9), 1672–1680.
Levin, A., & Nadler, B. (2011). Natural image denoising: Optimality and inherent bounds. In CVPR.
Levin, A., Nadler, B., Durand, F., & Freeman, W.T. (2012). Patch complexity, finite pixel correlations and optimal denoising. In ECCV.
Li, L., Huang, W., Gu, I. H., & Tian, Q. (2004). Statistical modeling of complex backgrounds for foreground object detection. IEEE Transactions on Image Processing, 13(11), 1459–1472.
Lin, Z., Ganesh, A., Wright, J., Wu, L., Chen, M., & Ma, Y. (2009). Fast convex optimization algorithms for exact recovery of a corrupted low-rank matrix. In International Workshop on Computational Advances in Multi-Sensor Adaptive Processing.
Lin, Z., Liu, R., & Su, Z. (2011). Linearized alternating direction method with adaptive penalty for low-rank representation. In NIPS.
Lin, Z., Liu, R., & Li, H. (2015). Linearized alternating direction method with parallel splitting and adaptive penalty for separable convex programs in machine learning. Machine Learning, 99(2), 287–325.
Liu, G., Lin, Z., Yan, S., Sun, J., Yu, Y., & Ma, Y. (2010). Robust subspace segmentation by low-rank representation. In ICML.
Liu, R., Lin, Z., De la Torre, F., & Su, Z. (2012). Fixed-rank representation for unsupervised visual learning. In CVPR.
Lu, C., Tang, J., Yan, S., & Lin, Z. (2014a). Generalized nonconvex nonsmooth low-rank minimization. In CVPR.
Lu, C., Zhu, C., Xu, C., Yan, S., & Lin, Z. (2014b). Generalized singular value thresholding. arXiv preprint arXiv:1412.2231.
Mairal, J., Bach, F., Ponce, J., Sapiro, G., & Zisserman, A. (2009). Non-local sparse models for image restoration. In ICCV.
Meng, D., & De la Torre, F. (2013). Robust matrix factorization with unknown noise. In ICCV.
Mirsky, L. (1975). A trace inequality of john von neumann. Monatshefte für Mathematik, 79(4), 303–306.
Mnih, A., & Salakhutdinov, R. (2007). Probabilistic matrix factorization. In NIPS.
Mohan, K., & Fazel, M. (2012). Iterative reweighted algorithms for matrix rank minimization. The Journal of Machine Learning Research, 13(1), 3441–3473.
Moreau, J. J. (1965). Proximité et dualité dans un espace hilbertien. Bulletin de la Société mathématique de France, 93, 273–299.
Mu, Y., Dong, J., Yuan, X., & Yan, S. (2011). Accelerated low-rank visual recovery by random projection. In CVPR.
Nie, F., Huang, H., & Ding, C. H. (2012). Low-rank matrix recovery via efficient Schatten p-norm minimization. In AAAI.
Oh, T. H., Kim, H., Tai, Y. W., Bazin, J. C., & Kweon, I. S. (2013). Partial sum minimization of singular values in RPCA for low-level vision. In ICCV.
Peng, Y., Ganesh, A., Wright, J., Xu, W., & Ma, Y. (2012). RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(11), 2233–2246.
Portilla, J. (2004). Blind non-white noise removal in images using Gaussian scale. In Proceedings of the IEEE Benelux Signal Processing Symposium.
Rhea, D. (2011). The case of equality in the von Neumann trace inequality. Preprint.
Roth, S., & Black, M. J. (2009). Fields of experts. International Journal of Computer Vision, 82(2), 205–229.
Ruslan, S., & Srebro, N. (2010). Collaborative filtering in a non-uniform world: Learning with the weighted trace norm. In NIPS.
She, Y. (2012). An iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors. Computational Statistics & Data Analysis, 56(10), 2976–2990.
Srebro, N., & Jaakkola, T. (2003). Weighted low-rank approximations. In ICML.
Srebro, N., Rennie, J., & Jaakkola, T.S. (2004). Maximum-margin matrix factorization. In NIPS.
Tipping, M. E., & Bishop, C. M. (1999). Probabilistic principal component analysis. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61(3), 611–622.
Wang, N., & Yeung, D.Y. (2013). Bayesian robust matrix factorization for image and video processing. In ICCV.
Wang, S., Zhang, L., & Liang, Y. (2012). Nonlocal spectral prior model for low-level vision. In ACCV.
Wright, J., Peng, Y., Ma, Y., Ganesh, A., & Rao, S. (2009). Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In NIPS.
Zhang, D., Hu, Y., Ye, J., Li, X., & He, X. (2012a). Matrix completion by truncated nuclear norm regularization. In CVPR.
Zhang, Z., Ganesh, A., Liang, X., & Ma, Y. (2012b). TILT: Transform invariant low-rank textures. International Journal of Computer Vision, 99(1), 1–24.
Zhao, Q., Meng, D., Xu, Z., Zuo, W., & Zhang, L. (2014) Robust principal component analysis with complex noise. In ICML.
Zheng, Y., Liu, G., Sugimoto, S., Yan, S., & Okutomi, M. (2012). Practical low-rank matrix approximation under robust \(l_1\) norm. In CVPR.
Zhou, M., Chen, H., Ren, L., Sapiro, G., Carin, L., & Paisley, J. W. (2009). Non-parametric Bayesian dictionary learning for sparse image representations. In NIPS.
Zhou, X., Yang, C., Zhao, H., & Yu, W. (2014). Low-rank modeling and its applications in image analysis. arXiv preprint arXiv:1401.3409.
Zoran, D., & Weiss, Y. (2011). From learning models of natural image patches to whole image restoration. In ICCV.
Acknowledgments
This work is supported by the Hong Kong RGC GRF grant (PolyU 5313/13E).
Additional information
Communicated by Jean-Michel Morel.
Appendix
In this appendix, we provide the proof details of the theoretical results in the main text.
1.1 Proof of Theorem 1
Proof
For any \({{\varvec{X}}}, {{\varvec{Y}}}\in \mathfrak {R}^{m\times {n}}(m>n)\), denote by \(\bar{{{\varvec{U}}}}{{\varvec{D}}}\bar{{{\varvec{V}}}}^T\) and \( {{\varvec{U}}}\varvec{\varSigma }{{\varvec{V}}}^T\) the singular value decompositions of the matrices \({{\varvec{X}}}\) and \({{\varvec{Y}}}\), respectively, where \(\varvec{\varSigma }=\left( \begin{array}{cc} diag(\sigma _1,\sigma _2,...,\sigma _n)\\ \mathbf 0 \end{array} \right) \in \mathfrak {R}^{m\times {n}}\), and \({{\varvec{D}}}=\left( \begin{array}{cc} diag(d_1,d_2,...,d_n)\\ \mathbf 0 \end{array} \right) \) are the diagonal singular value matrices. Based on the properties of the Frobenius norm, the following derivations hold:
Based on the von Neumann trace inequality in Lemma 1, we know that \(Tr\left( {{\varvec{Y}}}^T{{\varvec{X}}}\right) \) achieves its upper bound \(\sum _i^n\sigma _i d_i\) if \({{\varvec{U}}} = \bar{{{\varvec{U}}}}\) and \({{\varvec{V}}} = \bar{{{\varvec{V}}}}\). Then, we have
From the above derivation, we can see that the optimal solution of the WNNP problem in (5) is
where \({{\varvec{D}}}\) is the optimum of the constrained quadratic optimization problem in (6).
End of proof. \(\square \)
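For completeness, the reduction used in the proof can be summarized in a single chain. This is a sketch, assuming the WNNP objective in (5) is \(\Vert {{\varvec{Y}}}-{{\varvec{X}}}\Vert _F^2+\Vert {{\varvec{X}}}\Vert _{\mathbf{w },*}\) with \(\Vert {{\varvec{X}}}\Vert _{\mathbf{w },*}=\sum _{i=1}^n w_i d_i\):
$$\begin{aligned} \Vert {{\varvec{Y}}}-{{\varvec{X}}}\Vert _F^2+\Vert {{\varvec{X}}}\Vert _{\mathbf{w },*}&=\Vert {{\varvec{Y}}}\Vert _F^2-2Tr\left( {{\varvec{Y}}}^T{{\varvec{X}}}\right) +\Vert {{\varvec{X}}}\Vert _F^2+\sum \nolimits _{i=1}^n w_i d_i\\&\ge \sum \nolimits _{i=1}^n\sigma _i^2-2\sum \nolimits _{i=1}^n\sigma _i d_i+\sum \nolimits _{i=1}^n d_i^2+\sum \nolimits _{i=1}^n w_i d_i\\&=\sum \nolimits _{i=1}^n\left[ (d_i-\sigma _i)^2+w_i d_i\right] , \end{aligned}$$
with equality exactly when \({{\varvec{U}}}=\bar{{{\varvec{U}}}}\) and \({{\varvec{V}}}=\bar{{{\varvec{V}}}}\). Minimizing the last line over the \(d_i\) under the ordering constraint is precisely problem (6), and the minimizer of (5) is then recovered as \(\hat{{{\varvec{X}}}}={{\varvec{U}}}{{\varvec{D}}}{{\varvec{V}}}^T\).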
1.2 Proof of Corollary 1
Proof
Without considering the constraint, the optimization problem (6) degenerates to the following unconstrained formula:
It is not difficult to derive its global optimum as:
Since we have \(\sigma _1 \ge \sigma _2 \ge ... \ge \sigma _n\) and the weight vector has a non-descending order \(w_1\le w_2 \le ... \le w_n\), it is easy to see that \(\bar{d}_1 \ge \bar{d}_2 \ge ... \ge \bar{d}_n\). Thus, \(\bar{d}_{i=1,2,...,n}\) satisfy the constraint of (6), and the solution in (15) is then the globally optimal solution of the original constrained problem in (6).
End of proof. \(\square \)
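The scalar computation behind this closed form is worth recording. Assuming problem (6) has the form \(\min _{d_1\ge \cdots \ge d_n\ge 0}\sum _{i=1}^n(d_i-\sigma _i)^2+w_i d_i\) (consistent with the expansion in the proof of Theorem 1), dropping the ordering constraint decouples it into independent scalar problems:
$$\begin{aligned} \min _{d_i\ge 0}\ (d_i-\sigma _i)^2+w_i d_i \quad \Longrightarrow \quad \bar{d}_i=max\left( \sigma _i-\frac{w_i}{2},\,0\right) , \end{aligned}$$
obtained by setting the derivative \(2(d_i-\sigma _i)+w_i\) to zero and projecting onto \(d_i\ge 0\). Since \(\sigma _i\) is non-ascending and \(w_i\) is non-descending, \(\bar{d}_i\) is automatically non-ascending, which is why the unconstrained minimizer also solves the constrained problem.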
1.3 Proof of Theorem 2
Proof
Denote by \({{\varvec{U}}}_k\varvec{\varLambda }_k{{\varvec{V}}}_k^T\) the SVD of the matrix \(\{{{\varvec{Y}}}+\mu _k^{-1}{{\varvec{L}}}_k-{{\varvec{E}}}_{k+1}\}\) in the \((k+1)\)-th iteration, where \(\varvec{\varLambda }_k = \{diag(\sigma _k^1, \sigma _k^2 , ..., \sigma _k^n)\}\) is the diagonal singular value matrix. Based on the conclusion of Corollary 1, we have
where \(\varvec{\varSigma }_k = {\mathcal {S}}_\mathbf{w /\mu _k}(\varvec{\varLambda }_k)\) is the singular value matrix after weighted shrinkage. Based on the Lagrange multiplier updating method in step 5 of Algorithm 1, we have
Thus, \(\{{{\varvec{L}}}_{k}\}\) is bounded.
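To make this boundedness explicit, assume the multiplier update in step 5 has the standard form \({{\varvec{L}}}_{k+1}={{\varvec{L}}}_{k}+\mu _k({{\varvec{Y}}}-{{\varvec{X}}}_{k+1}-{{\varvec{E}}}_{k+1})\) and that \({{\varvec{X}}}_{k+1}={{\varvec{U}}}_k\varvec{\varSigma }_k{{\varvec{V}}}_k^T\) as given by Corollary 1. Then
$$\begin{aligned} {{\varvec{L}}}_{k+1}=\mu _k\left( {{\varvec{Y}}}+\mu _k^{-1}{{\varvec{L}}}_k-{{\varvec{E}}}_{k+1}-{{\varvec{X}}}_{k+1}\right) =\mu _k{{\varvec{U}}}_k\left( \varvec{\varLambda }_k-\varvec{\varSigma }_k\right) {{\varvec{V}}}_k^T, \end{aligned}$$
and each diagonal entry of \(\varvec{\varLambda }_k-\varvec{\varSigma }_k\) equals \(\min (\sigma _k^i,\,w_i/\mu _k)\le w_i/\mu _k\), so \(\Vert {{\varvec{L}}}_{k+1}\Vert _F\le \Vert \mathbf{w }\Vert _2\), a bound independent of k.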
To analyze the boundedness of \(\varGamma ({{\varvec{X}}}^{k+1},{{\varvec{E}}}^{k+1},{{\varvec{L}}}^{k},\mu ^k)\), first note that the following inequality holds because in steps 3 and 4 we obtain the globally optimal solutions of the \({{\varvec{X}}}\) and \({{\varvec{E}}}\) subproblems:
Then, based on the way we update \({{\varvec{L}}}\):
we have
Denote by \(\Theta \) the upper bound of \(\Vert {{\varvec{L}}}_k-{{\varvec{L}}}_{k-1}\Vert _F^2\) for all \(\{k=1,\ldots ,\infty \}\). We have
Since the penalty parameter \(\{\mu _k\}\) satisfies \(\sum _{k=1}^\infty \mu _k^{-2}\mu _{k+1}<+\infty \), we have
Thus, we know that \(\varGamma ({{\varvec{X}}}^{k+1},{{\varvec{E}}}^{k+1},{{\varvec{L}}}^{k},\mu ^k)\) is also upper bounded.
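The bookkeeping behind this bound can be sketched as follows, assuming the augmented Lagrangian has the standard form \(\varGamma ({{\varvec{X}}},{{\varvec{E}}},{{\varvec{L}}},\mu )=\Vert {{\varvec{X}}}\Vert _{\mathbf{w },*}+\Vert {{\varvec{E}}}\Vert _1+\langle {{\varvec{L}}},{{\varvec{Y}}}-{{\varvec{X}}}-{{\varvec{E}}}\rangle +\frac{\mu }{2}\Vert {{\varvec{Y}}}-{{\varvec{X}}}-{{\varvec{E}}}\Vert _F^2\). Using \({{\varvec{Y}}}-{{\varvec{X}}}_{k}-{{\varvec{E}}}_{k}=\mu _{k-1}^{-1}({{\varvec{L}}}_{k}-{{\varvec{L}}}_{k-1})\) together with the subproblem optimality inequality above, we get
$$\begin{aligned} \varGamma ({{\varvec{X}}}_{k+1},{{\varvec{E}}}_{k+1},{{\varvec{L}}}_{k},\mu _k)&\le \varGamma ({{\varvec{X}}}_{k},{{\varvec{E}}}_{k},{{\varvec{L}}}_{k},\mu _k)\\&=\varGamma ({{\varvec{X}}}_{k},{{\varvec{E}}}_{k},{{\varvec{L}}}_{k-1},\mu _{k-1})+\frac{\mu _k+\mu _{k-1}}{2\mu _{k-1}^2}\Vert {{\varvec{L}}}_{k}-{{\varvec{L}}}_{k-1}\Vert _F^2\\&\le \varGamma ({{\varvec{X}}}_{1},{{\varvec{E}}}_{1},{{\varvec{L}}}_{0},\mu _0)+\Theta \sum \nolimits _{j=1}^{k}\frac{\mu _j+\mu _{j-1}}{2\mu _{j-1}^2}, \end{aligned}$$
and since \(\mu _{j-1}\le \mu _j\) implies \(\frac{\mu _j+\mu _{j-1}}{2\mu _{j-1}^2}\le \mu _{j-1}^{-2}\mu _j\), the summability condition on \(\{\mu _k\}\) keeps the right-hand side finite.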
The boundedness of \(\{{{\varvec{X}}}^{k}\}\) and \(\{{{\varvec{E}}}^{k}\}\) can be easily deduced as follows:
Thus, \(\{{{\varvec{X}}}_{k}\}\), \(\{{{\varvec{E}}}_{k}\}\) and \(\{{{\varvec{L}}}_{k}\}\) generated by the proposed algorithm are all bounded. There exists at least one accumulation point for \(\{{{\varvec{X}}}_{k},{{\varvec{E}}}_{k},{{\varvec{L}}}_{k}\}\). Specifically, we have
and the accumulation point is a feasible solution to the objective function.
We then prove that the change of the variables in adjacent iterations tends to be zero. For the \({{\varvec{E}}}\) subproblem in step 3, we have
in which \(\mathcal {S}_{\frac{1}{\mu _k}}(\cdot )\) is the soft-thresholding operation with parameter \(\frac{1}{\mu _k}\), and m and n are the dimensions of the matrix \({{\varvec{Y}}}\).
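The soft-thresholding operator itself is easy to state concretely. A minimal sketch, assuming the \({{\varvec{E}}}\) subproblem is the usual \(\ell _1\)-penalized least-squares step of RPCA-type models, \(\min _{{{\varvec{E}}}}\tau \Vert {{\varvec{E}}}\Vert _1+\frac{1}{2}\Vert {{\varvec{E}}}-{{\varvec{Q}}}\Vert _F^2\) with \(\tau =1/\mu _k\) and \({{\varvec{Q}}}\) the matrix being thresholded (the names below are illustrative):

```python
import numpy as np

def soft_threshold(Q, tau):
    """Elementwise soft-thresholding S_tau(Q) = sign(Q) * max(|Q| - tau, 0).

    This is the closed-form minimizer of  tau * ||E||_1 + 0.5 * ||E - Q||_F^2,
    applied here with tau = 1 / mu_k in the E subproblem.
    """
    return np.sign(Q) * np.maximum(np.abs(Q) - tau, 0.0)

# Toy usage.
Q = np.random.randn(4, 4)
E = soft_threshold(Q, tau=0.1)
```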
To prove \(\lim _{k\rightarrow \infty }\Vert {{\varvec{X}}}_{k+1}-{{\varvec{X}}}_{k}\Vert _F=0\), we recall the updating strategy in Algorithm 1 which makes the following inequalities hold:
where \({{\varvec{U}}}_{k-1}\varvec{\varLambda }_{k-1}{{\varvec{V}}}_{k-1}^T\) is the SVD of the matrix \(\{{{\varvec{Y}}}+\mu _{k-1}^{-1}{{\varvec{L}}}_{k-1}-{{\varvec{E}}}_{k}\}\) in the k-th iteration. We then have
End of proof. \(\square \)
1.4 Proof of Remark 1
Proof
Based on the conclusion of Theorem 1, the WNNM problem can be equivalently transformed into a constrained singular value optimization problem. Furthermore, when the reweighting strategy \(w_i^{\ell +1}=\frac{C}{\sigma _i^\ell ({{\varvec{X}}})+\varepsilon }\) is used, the singular values of \({{\varvec{X}}}\) are always sorted in a non-ascending order, and the weight vector thus follows a non-descending order. It is then easy to deduce that the sorted orders of the sequences \(\{\sigma _i({{\varvec{Y}}}), \sigma _i({{\varvec{X}}}_\ell ),w_i^\ell ; i=1,2,\cdots ,n\}\) remain unchanged during the iterations. Thus, the optimization for each singular value \(\sigma _i({{\varvec{X}}})\) can be analyzed independently. For simplicity, in the following development we omit the subscript i and denote by y a singular value of the matrix \({{\varvec{Y}}}\), and by x and w the corresponding singular value of \({{\varvec{X}}}\) and its weight.
For the weighting strategy \(w^\ell =\frac{C}{x^{\ell -1}+\varepsilon }\), we have
Since we initialize \(x^0\) as a singular value of the matrix \({{\varvec{X}}}_0={{\varvec{Y}}}\), and each \(x^\ell \) is the result of a soft-thresholding operation on the positive value \(y=\sigma _i({{\varvec{Y}}})\), \(\{x^\ell \}\) is a non-negative sequence. The convergence value \(\lim _{\ell \rightarrow \infty } x^\ell \) under different conditions is analyzed as follows (a short numerical illustration of the recursion is given after the proof).
(1) \(c_2<0\): From the definition of \(c_1\) and \(c_2\), we have \((y+\varepsilon )^2-4C<0\). In this case, the quadratic equation \(x^2+(\varepsilon -y)x+C-y\varepsilon =0\) has no real solution, and the function \(f(x) = x^2+(\varepsilon -y)x+C-y\varepsilon \) attains its positive minimum value \(C-y\varepsilon -\frac{(y-\varepsilon )^2}{4}\) at \(x=\frac{y-\varepsilon }{2}\). \(\forall \tilde{x}\ge 0\), the following inequalities hold
$$\begin{aligned}&f(\tilde{x})\ge f\left( \frac{y-\varepsilon }{2}\right) \\&\tilde{x}^2+(\varepsilon -y)\tilde{x}\ge -\frac{(y-\varepsilon )^2}{4}\\&\tilde{x}-\frac{C-y\varepsilon -\frac{(y-\varepsilon )^2}{4}}{\tilde{x}+\varepsilon }\ge y-\frac{C}{\tilde{x}+\varepsilon }. \end{aligned}$$The sequence \(x^{\ell +1}=max\left( y-\frac{C}{x^{\ell }+\varepsilon },0\right) \) with initialization \(x^0=y\) is a monotonically decreasing sequence for any \(x^\ell \ge 0\). We have \(x^\ell <y\), and
$$\begin{aligned} x^\ell -\left( y-\frac{C}{x^\ell +\varepsilon }\right) >\frac{C-y\varepsilon -\frac{(y-\varepsilon )^2}{4}}{y+\varepsilon }. \end{aligned}$$If \(x^\ell \le \frac{C-y\varepsilon }{y}\), we have \(y-\frac{C}{x^\ell +\varepsilon }\le 0\) and \(x^{\ell +1} = max\left( y-\frac{C}{x^{\ell }+\varepsilon },0\right) =0\). If \(x^\ell >\frac{C-y\varepsilon }{y}\), there exists \(N\in \mathbb {N}\) such that \(x^{\ell +N}<x^\ell -N\cdot \frac{C-y\varepsilon -\frac{(y-\varepsilon )^2}{4}}{y+\varepsilon }\le \frac{C-y\varepsilon }{y}\). The sequence \(\{x^\ell \}\) thus shrinks to 0 monotonically.
(2) \(c_2\ge 0\): In this case, we know that \(y>0\), because if \(y=0\) we would have \(c_2=(y+\varepsilon )^2-4C=\varepsilon ^2-4C<0\). For positive C and sufficiently small \(\varepsilon \), \(c_1\) is also non-negative:
$$\begin{aligned}&c_2 = (y+\varepsilon )^2-4C\ge 0\\&(y+\varepsilon )^2\ge 4C\\&y-\varepsilon \ge 2(\sqrt{C}-\varepsilon )\\&c_1=y-\varepsilon \ge 0. \end{aligned}$$Having \(c_2\ge 0\), \(c_1\ge 0\), we have
$$\begin{aligned} \bar{x}_2 = \frac{y-\varepsilon +\sqrt{(y-\varepsilon )^2-4(C-\varepsilon y)}}{2}>0. \end{aligned}$$For any \(x>\bar{x}_2>0\), the following inequalities hold:
$$\begin{aligned}&f(x) = x^2+(\varepsilon -y)x+C-y\varepsilon>0\\&\left[ x-\left( y-\frac{C}{x+\varepsilon }\right) \right] (x+\varepsilon )>0\\&x>y-\frac{C}{x+\varepsilon } . \end{aligned}$$Furthermore, we have
$$\begin{aligned} x>y-\frac{C}{x+\varepsilon }>y-\frac{C}{\bar{x}_2+\varepsilon }=\bar{x}_2. \end{aligned}$$Thus, for \(x^0=y>\bar{x}_2\), we always have \(x^\ell>x^{\ell +1}>\bar{x}_2\): the sequence is monotonically decreasing and bounded below by \(\bar{x}_2\). We now show by contradiction that it converges to \(\bar{x}_2\). Suppose \(\{x^\ell \}\) converges to \(\hat{x}\ne \bar{x}_2\); then \(\hat{x}>\bar{x}_2\) and \(f(\hat{x})>0\). By the definition of convergence, \(\forall \epsilon >0\), \(\exists N\in \mathbb {N}\) s.t. \(\forall \ell \ge N\), the following inequality must be satisfied
$$\begin{aligned} |x^\ell -\hat{x}|<\epsilon . \end{aligned}$$(18) We also have the following inequalities
$$\begin{aligned}&f(x^N) \ge f(\hat{x})\\&\left[ x^N-\left( y-\frac{C}{x^N+\varepsilon }\right) \right] (x^N+\varepsilon ) \ge f(\hat{x})\\&\left[ x^N-\left( y-\frac{C}{x^N+\varepsilon }\right) \right] (y+\varepsilon ) \ge f(\hat{x})\\&x^N-\left( y-\frac{C}{x^N+\varepsilon }\right) \ge \frac{f(\hat{x})}{y+\varepsilon }\\&x^{N}-x^{N+1}>\frac{f(\hat{x})}{y+\varepsilon } \end{aligned}$$If we take \(\epsilon =\frac{f(\hat{x})}{2(y+\varepsilon )}\), then \( x^{N}-x^{N+1}> 2\epsilon \), and we can thus obtain
$$\begin{aligned} |x^{N+1}-\hat{x}|&=|x^{N+1}-x^N+x^N-\hat{x}|\\&\ge |x^{N+1}-x^N|-|x^N-\hat{x}|\\&>2\epsilon -\epsilon =\epsilon . \end{aligned}$$This contradicts (18) (applied at \(\ell =N+1\)), and thus \(\{x^\ell \}\) converges to \(\bar{x}_2\).
End of proof. \(\square \)
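As a quick numerical check of the two regimes analyzed above, the fixed-point recursion \(x^{\ell +1}=max\left( y-\frac{C}{x^{\ell }+\varepsilon },0\right) \) with \(x^0=y\) can be iterated directly; the values of y, C and \(\varepsilon \) below are arbitrary illustrative choices.

```python
def reweighted_fixed_point(y, C, eps, iters=100):
    """Iterate x <- max(y - C / (x + eps), 0) starting from x = y."""
    x = y
    for _ in range(iters):
        x = max(y - C / (x + eps), 0.0)
    return x

# c2 = (y + eps)^2 - 4C decides the limit:
#   c2 < 0   -> the sequence shrinks to 0;
#   c2 >= 0  -> it converges to the larger root x2 = (y - eps + sqrt(c2)) / 2.
for y, C, eps in [(1.0, 0.5, 1e-3),    # c2 < 0: limit is 0
                  (2.0, 0.5, 1e-3)]:   # c2 >= 0: limit is about 1.71
    print(y, C, eps, reweighted_fixed_point(y, C, eps))
```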
Cite this article
Gu, S., Xie, Q., Meng, D. et al. Weighted Nuclear Norm Minimization and Its Applications to Low Level Vision. Int J Comput Vis 121, 183–208 (2017). https://doi.org/10.1007/s11263-016-0930-5