
Exact Low-Rank Matrix Completion from Sparsely Corrupted Entries Via Adaptive Outlier Pursuit

Journal of Scientific Computing 56, 433–449 (2013)

Abstract

Recovering a low-rank matrix from a subset of its linear measurements is a popular problem in many areas of science and engineering. One special case is the matrix completion problem, where we must reconstruct a low-rank matrix from incomplete samples of its entries. Many efficient algorithms have been proposed for this problem, and they perform well when the given data are contaminated by Gaussian noise with small variance, but they cannot handle sparse random-valued noise in the measurements. In this paper, we propose a robust method for recovering the low-rank matrix via adaptive outlier pursuit when part of the measurements are damaged by outliers. The method detects the positions where the data is completely ruined and recovers the matrix using the correct measurements. Numerical experiments show the accuracy of the noise detection and the high performance of matrix completion of our algorithms compared with other algorithms.
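To make the alternating idea behind adaptive outlier pursuit concrete, here is a minimal numerical sketch: complete the matrix from the entries currently believed correct, then re-flag the \(K\) observed entries with the largest residuals as outliers. The inner solver, parameter choices, and all names (`svt_complete`, `aop_complete`, `tau`, `delta`) are illustrative stand-ins, not the authors' exact implementation; a plain singular value thresholding loop serves as the matrix-completion step.

```python
import numpy as np

def svt_complete(M, mask, tau, delta, iters=200):
    # Stand-in inner solver: singular value thresholding for matrix
    # completion on the entries where mask == 1 (any MC solver would do).
    Y = np.zeros_like(M)
    X = np.zeros_like(M)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
        Y += delta * mask * (M - X)
    return X

def aop_complete(M, mask, K, outer_iters=8):
    # Alternate between completing the matrix from the entries currently
    # believed correct and re-flagging the K observed entries with the
    # largest residuals as outliers (the adaptive outlier pursuit idea).
    m, n = M.shape
    tau = 5.0 * np.sqrt(m * n)            # common SVT heuristic
    delta = 1.2 * m * n / mask.sum()      # common SVT step size
    good = mask.astype(float)
    X = np.zeros_like(M)
    for _ in range(outer_iters):
        X = svt_complete(M, good, tau, delta)
        resid = np.abs(M - X) * mask      # residuals on observed entries
        kth = np.partition(resid[mask > 0], -K)[-K]  # K-th largest residual
        good = ((resid < kth) & (mask > 0)).astype(float)  # drop suspects
    return X, good

# Tiny synthetic demo (all sizes illustrative): a rank-8 matrix, 60% of
# entries observed, 40 of the observed entries grossly corrupted.
rng = np.random.default_rng(0)
L = rng.standard_normal((60, 8)) @ rng.standard_normal((8, 50))
mask = (rng.random(L.shape) < 0.6).astype(float)
M = L * mask
obs = np.flatnonzero(mask.ravel())
M.ravel()[obs[:40]] += 10.0 * rng.standard_normal(40)
X, flagged_good = aop_complete(M, mask, K=40)
```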



Acknowledgments

This work was supported by NSF Grant DMS-0714945, the Center for Domain-Specific Computing (CDSC) under NSF Expeditions in Computing Award CCF-0926127, ONR Grants N00014-11-1-0719 and N00014-08-1-119, and an ARO MURI subcontract from Rice University.

Author information

Corresponding author: Ming Yan.

Appendix

This appendix provides the mathematical proofs of the theoretical results in Sect. 4.

1.1 Proof of Theorem 1

Proof

When we overestimate \(K\), all \(K\) true outliers will be detected, making the objective function 0. Therefore, we only need to consider the \((p-K)\) correct entries. Meanwhile, some non-outliers (the overestimated \(\varDelta K\) entries) are also treated as outliers and will not be used for MC. As we know, if the number of given entries in one row (or column) is less than \(r\), the reconstructed matrix is not unique. Since \(\varDelta K\) of the \((p-K)\) known entries will not be used in reconstructing the matrix, when \(\varDelta K\) is large enough to bring the number of known entries in one row (or column) below \(r\), we will have more than one solution. Clearly, the smallest number of known entries in one row (or column) of the matrix is at most \(\lfloor (p-K)/m\rfloor \) (or \(\lfloor (p-K)/n\rfloor \)), where \(\lfloor x\rfloor \) is the largest integer that does not exceed \(x\). Without loss of generality, assume column \(j\) has the smallest number of known entries; this number is bounded by \(\min (\lfloor (p-K)/m\rfloor , \lfloor (p-K)/n\rfloor )\). To make the number of known entries in this column less than \(r\), the smallest number of entries to be deleted does not exceed \(\min (\lfloor (p-K)/m\rfloor , \lfloor (p-K)/n\rfloor )-r+1\). Thus if \(\varDelta K\) is greater than \(\min (\lfloor (p-K)/m\rfloor , \lfloor (p-K)/n\rfloor )-r\), the reconstructed matrix will not be unique.
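As a quick illustration with hypothetical numbers (a \(500\times 400\) rank-10 matrix with \(p=60{,}000\) samples, \(K=1{,}000\) of them outliers), the threshold on \(\varDelta K\) from the argument above can be computed directly:

```python
# Hypothetical instance: m x n rank-r matrix, p observed entries, K outliers.
m, n, r, p, K = 500, 400, 10, 60000, 1000
bound = min((p - K) // m, (p - K) // n) - r
print(bound)  # 108: if DeltaK > 108, the completion can fail to be unique
```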

1.2 Proof of Theorem 2

Proof

The probability that a given location is chosen is fixed and equals \(q=(p-K)/(mn)\), and whether one entry is chosen is independent of the other entries. Hence the number of known entries in each row (or column) of this matrix follows a binomial distribution. For each row, the cumulative distribution function can be expressed as

$$\begin{aligned} F(x,n,q)=P(X\le x)=\sum _{i=0}^{\lfloor x \rfloor }\binom{n}{i}q^i(1-q)^{n-i}, \end{aligned}$$
(7.1)

where \(X\) is the number of known entries in this row.
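For concreteness, (7.1) is straightforward to evaluate exactly; the sketch below uses hypothetical values for the row length \(n\) and the sampling rate \(q\):

```python
import math

# Exact evaluation of (7.1): P(X <= x) for X ~ Binomial(n, q).
n, q, x = 150, 0.3, 20   # hypothetical row length and sampling rate
F = sum(math.comb(n, i) * q**i * (1 - q)**(n - i) for i in range(x + 1))
print(F)  # tiny: a single row rarely has only 20 or fewer known entries
```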

From Hoeffding's inequality, we have

$$\begin{aligned} F(k,n,q)\le \exp \left(-2{(nq-k)^2\over n}\right), \end{aligned}$$
(7.2)

for any integer \(k\le nq\). Since the numbers of known entries in different rows are independent, we can find an upper bound for the probability that there exists a row with at most \(k\) given entries:

$$\begin{aligned} P(\min (k_1^r,k_2^r,\ldots ,k_m^r)\le k)\le 1-\left(1-\exp \left(-2{(nq-k)^2\over n}\right)\right)^m:=P_1. \end{aligned}$$
(7.3)

Here \(k_i^r\) stands for the number of given entries in the \(i\)th row. Similarly, the upper bound for the probability that there exists a column with at most \(k\) given entries can be expressed as follows:

$$\begin{aligned} P(\min (k_1^c,k_2^c,\ldots ,k_n^c)\le k)\le 1-\left(1-\exp \left(-2{(mq-k)^2\over m}\right)\right)^n:=P_2, \end{aligned}$$
(7.4)

where \(k_j^c\) is the number of given entries in the \(j\)th column. Combining these two, we have

$$\begin{aligned} P(\min (k_1^r,k_2^r,\ldots ,k_m^r,k_1^c,k_2^c,\ldots ,k_n^c)\le k)\le P_1+P_2\le 2\max (P_1,P_2), \end{aligned}$$
(7.5)

which means the probability that there exists one row or column with at most \(k\) given entries can be bounded by \(2\max (P_1,P_2)\).
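The union bound (7.5) can be sanity-checked numerically; the following Monte Carlo sketch (dimensions and parameters assumed for illustration) compares the empirical probability with \(P_1+P_2\) from (7.3) and (7.4):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, q, k, trials = 200, 150, 0.3, 20, 2000

# Empirical P(some row or column has at most k observed entries).
hits = 0
for _ in range(trials):
    mask = rng.random((m, n)) < q
    if min(mask.sum(axis=1).min(), mask.sum(axis=0).min()) <= k:
        hits += 1

# Hoeffding-based bounds P1, P2 from (7.3) and (7.4).
P1 = 1 - (1 - np.exp(-2 * (n * q - k) ** 2 / n)) ** m
P2 = 1 - (1 - np.exp(-2 * (m * q - k) ** 2 / m)) ** n
print(hits / trials, "<=", P1 + P2)  # empirical rate sits below the bound
```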

Let us first assume \(P_1>P_2\). Defining

$$\begin{aligned} P_0=2\left(1-\left(1-\exp \left(-2{(nq-k)^2\over n}\right)\right)^m\right), \end{aligned}$$
(7.6)

we then have

$$\begin{aligned} \exp \left(-2{(nq-k)^2\over n}\right)=1-\left(1-{P_0\over 2}\right)^{1/m} \end{aligned}$$
(7.7)
$$\begin{aligned} \Longrightarrow \ k=nq-\sqrt{{-n\over 2}\log (1-(1-P_0/2)^{1/m})}. \end{aligned}$$
(7.8)

When \(P_1<P_2\), we have

$$\begin{aligned} k=mq-\sqrt{{-m\over 2}\log (1-(1-P_0/2)^{1/n})}. \end{aligned}$$
(7.9)

We define \(K_1\) as the smaller of these two values. Hence, given \(P_0\), the probability of having one row or column with at most \(K_1\) entries is less than \(P_0\).

Besides Hoeffding's inequality, we can also apply Chernoff's inequality,

$$\begin{aligned} F(k,n,q)\le \exp \left(-{1\over 2q}{(nq-k)^2\over n}\right). \end{aligned}$$
(7.10)

In this case

$$\begin{aligned} P_1&=1-\left(1-\exp \left(-{1\over 2q}{(nq-k)^2\over n}\right)\right)^m\end{aligned}$$
(7.11)
$$\begin{aligned} P_2&=1-\left(1-\exp \left(-{1\over 2q}{(mq-k)^2\over m}\right)\right)^n. \end{aligned}$$
(7.12)

After a similar calculation, we have

$$\begin{aligned} k&=nq-\sqrt{{-2nq}\log (1-(1-P_0/2)^{1/m})} \end{aligned}$$
(7.13)

for \(P_1>P_2\), and

$$\begin{aligned} k&=mq-\sqrt{{-2mq}\log (1-(1-P_0/2)^{1/n})} \end{aligned}$$
(7.14)

when \(P_1<P_2\). Similarly, we define \(K_2\) as the smaller of these two values; the probability of having one row or column with at most \(K_2\) entries is again less than \(P_0\). Combining the results from the two inequalities, we conclude that with probability at most \(P_0\) there exists a row or column with at most \(\max (K_1,K_2)\) given entries.
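In practice, the thresholds \(K_1\) (from (7.8)-(7.9)) and \(K_2\) (from (7.13)-(7.14)) are easy to evaluate; the sketch below does so for assumed values of \(m\), \(n\), \(q\), and \(P_0\) (the function name is illustrative):

```python
import math

def min_entry_threshold(m, n, q, P0):
    # With probability at least 1 - P0, every row and column holds more
    # than max(K1, K2) observed entries.
    def inner(dim):  # log(1 - (1 - P0/2)^(1/dim)), a negative number
        return math.log(1.0 - (1.0 - P0 / 2.0) ** (1.0 / dim))
    k1 = min(n * q - math.sqrt(-n / 2.0 * inner(m)),      # Hoeffding, (7.8)
             m * q - math.sqrt(-m / 2.0 * inner(n)))      # Hoeffding, (7.9)
    k2 = min(n * q - math.sqrt(-2.0 * n * q * inner(m)),  # Chernoff, (7.13)
             m * q - math.sqrt(-2.0 * m * q * inner(n)))  # Chernoff, (7.14)
    return max(k1, k2)

# Assumed instance: 500 x 400 matrix with a 29.5% reliable sampling rate.
print(min_entry_threshold(500, 400, 0.295, P0=1e-3))
```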

About this article

Cite this article

Yan, M., Yang, Y. & Osher, S. Exact Low-Rank Matrix Completion from Sparsely Corrupted Entries Via Adaptive Outlier Pursuit. J Sci Comput 56, 433–449 (2013). https://doi.org/10.1007/s10915-013-9682-3

