Incoherent Submatrix Selection via Approximate Independence Sets in Scalar Product Graphs

Chrétien, Stéphane; Ho, Zhen Wai Olivier

doi:10.1007/978-3-030-37599-7_9

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11943)

Included in the following conference series:

International Conference on Machine Learning, Optimization, and Data Science

Abstract

This paper addresses the problem of extracting the largest possible number of columns from a given matrix $X\in \mathbb R^{n\times p}$ in such a way that the resulting submatrix has a coherence smaller than a given threshold $\eta$. This problem can be expressed as that of finding a maximum cardinality stable set in the graph whose adjacency matrix is obtained by taking the componentwise absolute value of $X^tX$ and setting entries less than $\eta$ to 0 and the other entries to 1. We propose a spectral-type relaxation which boils down to optimising a quadratic function on a sphere. We prove a theoretical approximation bound for the solution of the resulting relaxed problem.
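The graph construction described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the columns of `X` are taken to be unit-norm, the diagonal of the adjacency matrix is zeroed to avoid self-loops, and the function names `coherence_graph` and `greedy_stable_set` are hypothetical. The stable set is found with a simple minimum-degree greedy heuristic, which stands in for, and is not, the spectral relaxation proposed in the paper.

```python
import numpy as np

def coherence_graph(X, eta):
    """Adjacency matrix with an edge (i, j) whenever |<x_i, x_j>| >= eta."""
    G = np.abs(X.T @ X)            # componentwise absolute value of X^t X
    A = (G >= eta).astype(int)     # entries below eta -> 0, the others -> 1
    np.fill_diagonal(A, 0)         # assumption: ignore self-loops
    return A

def greedy_stable_set(A):
    """Greedy heuristic (not the paper's method): repeatedly keep a vertex of
    minimum remaining degree and discard its neighbours."""
    remaining = set(range(A.shape[0]))
    chosen = []
    while remaining:
        idx = sorted(remaining)
        v = min(idx, key=lambda i: A[i, idx].sum())
        chosen.append(v)
        remaining -= {v} | {j for j in idx if A[v, j]}
    return sorted(chosen)

# Three unit-norm columns in R^2; columns 0 and 1 have scalar product 0.8.
X = np.array([[1.0, 0.8, 0.0],
              [0.0, 0.6, 1.0]])
A = coherence_graph(X, eta=0.7)
print(greedy_stable_set(A))  # -> [0, 2]
```

The selected columns form a submatrix whose off-diagonal scalar products are all below the threshold, which is exactly the stable-set property in the thresholded graph.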

Notes

1.
Here, positivity is trivial.

References

1. Adcock, B., Hansen, A.C., Poon, C., Roman, B.: Breaking the coherence barrier: a new theory for compressed sensing. Forum Math. Sigma 5, 84 (2017)
2. Adcock, B., Hansen, A.C., Poon, C., Roman, B., et al.: Breaking the coherence barrier: asymptotic incoherence and asymptotic sparsity in compressed sensing. Preprint (2013)
3. Arora, S., Ge, R., Moitra, A.: New algorithms for learning incoherent and overcomplete dictionaries. In: Conference on Learning Theory, pp. 779–806 (2014)
4. Baraniuk, R.G.: Compressive sensing [lecture notes]. IEEE Signal Process. Mag. 24(4), 118–121 (2007)
5. Bellec, P.C.: Localized Gaussian width of $m$-convex hulls with applications to lasso and convex aggregation. arXiv preprint arXiv:1705.10696 (2017)
6. Bühlmann, P., Van De Geer, S.: Statistics for High-dimensional Data: Methods, Theory and Applications. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-20192-9
7. Candes, E., Romberg, J.: Sparsity and incoherence in compressive sampling. Inverse Prob. 23(3), 969 (2007)
8. Candès, E.J.: Mathematics of sparsity (and a few other things). In: Proceedings of the International Congress of Mathematicians, Seoul, South Korea, vol. 123. Citeseer (2014)
9. Candes, E.J., Eldar, Y.C., Needell, D., Randall, P.: Compressed sensing with coherent and redundant dictionaries. Appl. Comput. Harmonic Anal. 31(1), 59–73 (2011)
10. Candès, E.J., Plan, Y.: Near-ideal model selection by l1 minimization. Ann. Stat. 37(5A), 2145–2177 (2009)
11. Candès, E.J., Wakin, M.B.: An introduction to compressive sampling. IEEE Signal Process. Mag. 25(2), 21–30 (2008)
12. Cevher, V., Boufounos, P., Baraniuk, R.G., Gilbert, A.C., Strauss, M.J.: Near-optimal Bayesian localization via incoherence and sparsity. In: International Conference on Information Processing in Sensor Networks, IPSN 2009, pp. 205–216. IEEE (2009)
13. Chrétien, S., Darses, S.: Invertibility of random submatrices via tail-decoupling and a matrix Chernoff inequality. Stat. Probab. Lett. 82(7), 1479–1487 (2012)
14. Chrétien, S., Darses, S.: Sparse recovery with unknown variance: a Lasso-type approach. IEEE Trans. Inf. Theory 60(7), 3970–3988 (2014)
15. Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing, vol. 1. Birkhäuser, Basel (2013)
16. Hager, W.W.: Minimizing a quadratic over a sphere. SIAM J. Optim. 12(1), 188–208 (2001)
17. Mallat, S.: A Wavelet Tour of Signal Processing: The Sparse Way. Academic Press, Cambridge (2008)
18. Nelson, J.L., Temlyakov, V.N.: On the size of incoherent systems. J. Approximation Theory 163(9), 1238–1245 (2011)
19. Romberg, J.: Imaging via compressive sampling. IEEE Signal Process. Mag. 25(2), 14–20 (2008)
20. Van De Geer, S.A., Bühlmann, P., et al.: On the conditions used to prove oracle results for the Lasso. Electron. J. Stat. 3, 1360–1392 (2009)

Author information

Authors and Affiliations

National Physical Laboratory, Teddington, TW11 0LW, UK
Stéphane Chrétien
Laboratoire de Mathématiques de Besançon, 16 Route de Gray, 25030, Besançon, France
Stéphane Chrétien & Zhen Wai Olivier Ho
FEMTO-ST, 15B avenue des Montboucons, 25030, Besançon, France
Stéphane Chrétien

Corresponding author

Correspondence to Stéphane Chrétien.

Editor information

Editors and Affiliations

University of Cambridge, Cambridge, UK
Giuseppe Nicosia
University of Florida, Gainesville, FL, USA
Panos Pardalos
Harvard University, Cambridge, MA, USA
Renato Umeton
Università di Catania, Catania, Catania, Italy
Giovanni Giuffrida
Almawave, Rome, Roma, Italy
Vincenzo Sciacca

A Minimizing Quadratic Functionals on the Sphere

A.1 A Semi-explicit Solution

The following result can be found in [16].

Lemma 1

For $Q\in \mathbb S_p$ and $q\in \mathbb R^p$, consider the following quadratic programming problem over the sphere:

$$\begin{aligned} \min _{\Vert x \Vert _2=1} \quad \frac{1}{2} x^tQx-q^tx. \end{aligned}$$

(10)

Let $\lambda _1 \le \ldots \le \lambda _p$ be the eigenvalues of $Q$ and $\phi _1,\ldots ,\phi _p$ be associated pairwise orthogonal, unit-norm eigenvectors. Let $\gamma _i=q^t \phi _i$, $i=1,\ldots ,p$. Let $\mathcal E_1=\{i \text { s.t. } \lambda _i=\lambda _1 \}$ and $\mathcal E_+=\{i \text { s.t. } \lambda _i>\lambda _1 \}$. Then, $x^*$ is a solution if and only if

$$\begin{aligned} x^*&= \sum _{i=1}^p c^*_i \phi _i \end{aligned}$$

and

1. Degenerate case: if $\gamma _i=0$ for all $i \in \mathcal E_1$ and
$$\begin{aligned} \sum _{i\in \mathcal E_+} \, \frac{\gamma _i^2}{(\lambda _i-\lambda _1)^2} \le 1, \end{aligned}$$
then $c_i^*=\gamma _i/(\lambda _i-\lambda _1)$ for $i\in \mathcal E_+$, and the $c_i^*$, $i\in \mathcal E_1$, are arbitrary under the constraint that $\sum _{i\in \mathcal E_1} (c^*_i)^2 = 1-\sum _{i\in \mathcal E_+} (c^*_i)^2$.
2. Nondegenerate case: otherwise, $c_i^*=\gamma _i/(\lambda _i-\mu )$, $i=1,\ldots ,p$, where $\mu < \lambda _1$ is a solution of
$$\begin{aligned} \sum _{i=1}^p \, \frac{\gamma _i^2}{(\lambda _i-\mu )^2}&= 1. \end{aligned}$$
(11)
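As an illustration of the nondegenerate case, the following minimal Python sketch computes a minimiser of (10) from the eigendecomposition of $Q$ and a bisection on equation (11). It assumes that the multiplier is sought below $\lambda_1$, so that $Q-\mu I$ is positive definite, and it does not handle the degenerate case. The function name `sphere_qp` and the bisection bracket are choices made here for illustration, not taken from the paper or from [16].

```python
import numpy as np

def sphere_qp(Q, q, iters=200):
    """Minimise 0.5*x'Qx - q'x over the unit sphere (nondegenerate case only)."""
    lam, Phi = np.linalg.eigh(Q)      # ascending eigenvalues, orthonormal eigenvectors
    gamma = Phi.T @ q                 # gamma_i = q' phi_i

    def g(mu):                        # left-hand side of equation (11)
        return np.sum(gamma**2 / (lam - mu)**2)

    # g is increasing on (-inf, lambda_1). At `lo` every denominator is at
    # least (||gamma|| + 1)^2, hence g(lo) < 1; g exceeds 1 near lambda_1.
    lo, hi = lam[0] - np.linalg.norm(gamma) - 1.0, lam[0]
    for _ in range(iters):            # plain bisection on g(mu) = 1
        mid = 0.5 * (lo + hi)
        if g(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    mu = lo
    c = gamma / (lam - mu)            # coefficients c_i^* in the eigenbasis
    return Phi @ c, mu

Q = np.diag([1.0, 2.0, 3.0])
q = np.array([0.3, 0.2, 0.1])
x_star, mu = sphere_qp(Q, q)
print(np.linalg.norm(x_star), mu)     # the norm should be 1 up to rounding
```

Bisection is used only because the left-hand side of (11) is monotone on the bracket; any scalar root finder would serve the same purpose.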

Moreover, we have the following useful result.

Corollary 1

If Q is positive definite, and $\sum _{i=1,\ldots ,p} \ \gamma _i^2/\lambda _i^2 <1$, then $0<\mu <\lambda _1$.

Proof

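This follows immediately from the intermediate value theorem: viewed as a function of $\mu $, the left-hand side of (11) is continuous and increasing on $(-\infty ,\lambda _1)$, is smaller than 1 at $\mu =0$ by assumption, and exceeds 1 as $\mu $ approaches $\lambda _1$ in the nondegenerate case.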

A.2 Bounds on $\mu $

From (11), we can get the following easy bounds on $\mu $.

Lemma 2

Let $\gamma _{\min }= \min _{i=1}^p \gamma _i$ and $\gamma _{\max }= \max _{i=1}^p \gamma _i$. Then, we have

$$\begin{aligned} p \gamma _{\max }^2 \ge \max _{i=1}^p \ \{(\lambda _i-\mu )^2\}&\ge p \gamma _{\min }^2. \end{aligned}$$

(12)

and

$$\begin{aligned} \gamma _{\min }^2 \le \min _{i=1}^p \ \{(\lambda _i-\mu )^2\}&\le \Vert \gamma \Vert _2^2 \end{aligned}$$

(13)

Proof

The proof is divided into two parts, corresponding to each (double) inequality.

Proof of (12): We have

$$\begin{aligned} \max _{i=1}^p \frac{\gamma ^2_{\max }}{(\lambda _i-\mu )^2}&\ge \max _{i=1}^p \frac{\gamma _i^2}{(\lambda _i-\mu )^2} \\&\ge \frac{1}{p} \sum _{i=1}^p \frac{\gamma _i^2}{(\lambda _i-\mu )^2}\\&=\frac{1}{p}. \end{aligned}$$

This immediately gives $p \gamma _{\max }^2 \ge \max _{i=1}^p \ \{(\lambda _i-\mu )^2\}$. On the other hand, we have

$$\begin{aligned} 1= \sum _{i=1}^p \, \frac{\gamma _i^2}{(\lambda _i-\mu )^2}&\ge \frac{p\gamma _{\min }^2}{\max _{i=1}^p \{(\lambda _i-\mu )^2\}}. \end{aligned}$$

Therefore, we get $\max _{i=1}^p \{(\lambda _i-\mu )^2\}\ge p \ \gamma _{\min }^2$.

Proof of (13): Since the terms of the sum in (11) are nonnegative and sum to one, each of them satisfies

$$\begin{aligned} \frac{\gamma _i^2}{(\lambda _i-\mu )^2}&\le 1 \end{aligned}$$

which gives

$$\begin{aligned} (\lambda _i-\mu )^2&\ge \gamma _i^2 \end{aligned}$$

for $i=1,\ldots ,p$. Thus, the lower bound follows. For the other bound, since

$$\begin{aligned} \sum _{i=1}^p \frac{\gamma _i^2}{(\lambda _i-\mu )^2}&=1, \end{aligned}$$

(14)

we get

$$\begin{aligned} 1 = \sum _{i=1}^p \frac{\gamma _i^2}{(\lambda _i-\mu )^2}&\le \frac{\Vert \gamma \Vert _2^2}{\min _{i=1}^p \ (\lambda _i-\mu )^2} \end{aligned}$$

and the proof is completed.
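The bounds whose proofs are one-line consequences of (11), namely the lower bound in (12) and both bounds in (13), can be checked numerically. The sketch below is an illustration only: it reads $\gamma_{\min}^2$ as $\min_i \gamma_i^2$, picks an arbitrary multiplier below $\lambda_1$, and rescales a random $\gamma$ so that (11) holds exactly. The function name `check_lemma2_bounds` is hypothetical.

```python
import numpy as np

def check_lemma2_bounds(lam, mu, gamma):
    """Check the lower bound of (12) and both bounds of (13) for given data."""
    d2 = (lam - mu)**2                 # (lambda_i - mu)^2
    g2 = gamma**2                      # gamma_i^2
    p = len(lam)
    return {
        "eq11_holds": bool(np.isclose(np.sum(g2 / d2), 1.0)),
        "max_ge_p_gmin2": bool(d2.max() >= p * g2.min()),   # lower bound in (12)
        "min_ge_gmin2": bool(d2.min() >= g2.min()),         # lower bound in (13)
        "min_le_norm2": bool(d2.min() <= np.sum(g2)),       # upper bound in (13)
    }

rng = np.random.default_rng(0)
p = 6
lam = np.sort(rng.uniform(1.0, 5.0, size=p))         # eigenvalues, ascending
mu = 0.5 * lam[0]                                    # any multiplier below lambda_1
gamma = rng.normal(size=p)
gamma /= np.sqrt(np.sum(gamma**2 / (lam - mu)**2))   # rescale so that (11) holds
print(check_lemma2_bounds(lam, mu, gamma))
```

Rescaling $\gamma$ rather than solving for $\mu$ keeps the check independent of any root-finding routine.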

A.3 $\ell _\infty $ Perturbation of the Linear Term

We now consider the problem of controlling the solution under perturbation of q.

Lemma 3

Consider the two quadratic programming problems over the sphere:

$$\begin{aligned} \min _{\Vert x\Vert _2=1} \quad \frac{1}{2} x^tQx-q_k^tx, \end{aligned}$$

(15)

for $k=1,2$. Assume that the solution to (15) is non-degenerate in both cases $k=1,2$ and let $x^*_1$ and $x^*_2$ be the corresponding solutions. Let $\gamma _{k,i}=q_k^t \phi _i$, $i=1,\ldots ,p$, and assume further that $\sum _{i=1}^p \ \gamma _{k,i}^2/\lambda _i^2 <1$, $k=1,2$. Let $\phi $ denote the inverse function of $x\mapsto x/(1+x)^3$. Then, we have

$$\begin{aligned} \Vert x_1^*-x_2^*\Vert _{\infty }&\le \sqrt{p} \left( \frac{\Vert \gamma _{1}-\gamma _{2}\Vert _2 }{(\lambda _1-\mu _2)}+ \frac{\Vert \gamma _{1}\Vert _2 \ \nu ^*}{(\lambda _1-\mu _1)(\lambda _1-\mu _2)} \right) ^2, \end{aligned}$$

with $\nu ^*$ given by

$$\begin{aligned} \nu ^*&= (\lambda _p-\mu _1) \phi \left( p \ \frac{\gamma _{1,\max }^2}{\gamma _{1,\min }^2} \frac{\Vert \gamma _1^2-\gamma _2^2 \Vert _1}{2 \ \Vert \gamma _2\Vert _2^2}\right) \end{aligned}$$

Proof

Let $\varPhi $ denote the matrix whose columns are the eigenvectors of $Q$. More precisely, let $\lambda _1\le \cdots \le \lambda _p$ be the eigenvalues of $Q$ and let $\phi _i$ be a unit-norm eigenvector associated with $\lambda _i$, $i=1,\ldots ,p$. Let $\gamma _{k,i}=q_k^t \phi _i$, $i=1,\ldots ,p$, $k=1,2$. Let $c_1^*$ (resp. $c_2^*$) be the vector of coefficients of $x_1^*$ (resp. $x_2^*$) in the eigenbasis of $Q$. For each $k=1,2$, there exists a real $\mu _k$ such that

$$\begin{aligned} c_{k,i}^*=\frac{\gamma _{k,i}}{(\lambda _i-\mu _k)}, \end{aligned}$$

$i=1,\ldots ,p$, where $\mu _k < \lambda _1$ is a solution of

$$\begin{aligned} \sum _{i=1}^p \, \frac{\gamma _{k,i}^2}{(\lambda _i-\mu )^2} = 1. \end{aligned}$$

Now, apply Neuberger’s Theorem 2 (recalled in Section A.4) to obtain an estimate of $\vert \mu _1-\mu _2\vert $ as a function of $\gamma _1$ and $\gamma _2$. For this purpose, set

$$\begin{aligned} F(\mu )&= \sum _{i=1}^p \, \frac{\gamma _{2,i}^2}{(\lambda _i-\mu )^2} -1, \quad \text {so that} \quad F'(\mu ) = 2 \sum _{i=1}^p \ \frac{\gamma _{2,i}^2}{(\lambda _i-\mu )^3}. \end{aligned}$$

Now, we need to find the smallest value of $\nu $ such that, for all $\mu \in B(\mu _1,\nu )$, there exists a number $h \in \bar{B}(0,\nu )$ such that

$$\begin{aligned} h&= -F'(\mu )^{-1}\ F(\mu _1). \end{aligned}$$

We therefore have that

$$\begin{aligned} h&= -\frac{\sum _{i=1}^p \frac{\gamma ^2_{2,i}}{(\lambda _i-\mu _1)^2}-1}{2 \ \sum _{i=1}^p \frac{\gamma ^2_{2,i}}{(\lambda _i-\mu )^3}} = -\frac{\sum _{i=1}^p \frac{\gamma ^2_{1,i}}{(\lambda _i-\mu _1)^2}-1+ \sum _{i=1}^p \frac{\gamma ^2_{2,i}-\gamma ^2_{1,i}}{(\lambda _i-\mu _1)^2}}{2 \ \sum _{i=1}^p \frac{\gamma ^2_{2,i}}{(\lambda _i-\mu )^3}} \end{aligned}$$

and since

$$\begin{aligned} \sum _{i=1}^p \ \frac{\gamma ^2_{1,i}}{(\lambda _i-\mu _1)^2}&=1, \end{aligned}$$

we have

$$\begin{aligned} \vert h\vert&\le \frac{ (\min _{i=1}^p\ \{(\lambda _i -\mu _1)^{2}\})^{-1} \ \Vert \gamma ^2_1-\gamma ^2_2\Vert _1 }{2 \ \Vert \gamma _{2} \Vert _2^2 \ (\max \{(\lambda _i-\mu )^3\})^{-1}} \end{aligned}$$

where $\cdot ^2$ is to be understood componentwise. Moreover, since $\sum _{i=1}^p \gamma _{k,i}^2/\lambda _i^2 <1$, $k=1,2$, we have, for all $\mu \in B(\mu _1,\nu )$,

$$\begin{aligned} \max \{(\lambda _i-\mu )^3\}&\le (\lambda _p-\mu _1 +\nu )^3 \text { and } \min _{i=1}^p \{(\lambda _i-\mu _1)^2\} = (\lambda _1-\mu _1)^2. \end{aligned}$$

Thus, for $\nu >0$ such that

$$\begin{aligned} \nu&\ge \frac{ \Vert \gamma ^2_1-\gamma ^2_2\Vert _1 \ (\lambda _p-\mu _1+\nu )^{3} }{2 \ \Vert \gamma _{2} \Vert _2^2 \ (\lambda _1 -\mu _1)^2}, \end{aligned}$$

we get from Theorem 2 that there exists a solution to the equation $F(u)=0$ inside the ball $\bar{B}(\mu _1,\nu )$. Make the change of variable

$$\begin{aligned} \nu&= \alpha (\lambda _p-\mu _1) \end{aligned}$$

and obtain that we need to find $\alpha \in (0,1)$ such that

$$\begin{aligned} \frac{\alpha }{(1+\alpha )^3}&\ge \frac{ \Vert \gamma ^2_1-\gamma ^2_2\Vert _1 \ (\lambda _p-\mu _1)^{2} }{2 \ \Vert \gamma _{2} \Vert _2^2 \ (\lambda _1 -\mu _1)^2}. \end{aligned}$$

Lemma 2 now gives

$$\begin{aligned} \frac{ (\lambda _p-\mu _1)^{2} }{(\lambda _1 -\mu _1)^2}&\le p\ \frac{\gamma _{1,\max }^2}{\gamma _{1,\min }^2} \end{aligned}$$

from which we get that the value $\nu ^*$ of $\nu $ given by

$$\begin{aligned} \nu ^*&= (\lambda _p-\mu _1) \phi \left( p \ \frac{\gamma _{1,\max }^2}{\gamma _{1,\min }^2} \frac{\Vert \gamma _1^2-\gamma _2^2 \Vert _1}{2 \ \Vert \gamma _2\Vert _2^2}\right) \end{aligned}$$

is admissible, for $\Vert \gamma _1^2-\gamma _2^2\Vert _1$ such that the term involving $\phi $ is less than one. Next, we compare the coefficients $c^*_{1,i}$ and $c^*_{2,i}$: for $i=1,\ldots ,p$,

$$\begin{aligned} \frac{\gamma _{1,i}}{(\lambda _i-\mu _1)}-\frac{\gamma _{2,i}}{(\lambda _i-\mu _2)}&= \frac{\gamma _{1,i}(\lambda _i-\mu _1+\mu _1-\mu _2)-\gamma _{2,i}(\lambda _i-\mu _1)}{(\lambda _i-\mu _1)(\lambda _i-\mu _2)} \\&= \frac{(\gamma _{1,i}-\gamma _{2,i})}{\lambda _i-\mu _2}+\frac{\gamma _{1,i}(\mu _1-\mu _2)}{(\lambda _i-\mu _1)(\lambda _i-\mu _2)}. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert c_1^*-c_2^* \Vert _2^2&\le \left( \frac{\Vert \gamma _{1}-\gamma _{2}\Vert _2 }{(\lambda _1-\mu _2)}+ \frac{\Vert \gamma _{1}\Vert _2 \ \vert \mu _1-\mu _2\vert }{(\lambda _1-\mu _1)(\lambda _1-\mu _2)} \right) ^2. \end{aligned}$$

Finally, using that $\vert \mu _1-\mu _2\vert \le \nu ^*$, we get

$$\begin{aligned} \Vert c^*_1-c^*_2 \Vert _2^2&\le \left( \frac{\Vert \gamma _{1}-\gamma _{2}\Vert _2 }{(\lambda _1-\mu _2)}+ \frac{\Vert \gamma _{1}\Vert _2 \ \nu ^*}{(\lambda _1-\mu _1)(\lambda _1-\mu _2)} \right) ^2, \end{aligned}$$

which gives

$$\begin{aligned} \Vert x_1^*-x_2^*\Vert _{\infty }&\le \sqrt{p} \left( \frac{\Vert \gamma _{1}-\gamma _{2}\Vert _2 }{(\lambda _1-\mu _2)}+ \frac{\Vert \gamma _{1}\Vert _2 \ \nu ^*}{(\lambda _1-\mu _1)(\lambda _1-\mu _2)} \right) ^2, \end{aligned}$$

as announced.
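The coefficient decomposition used in the proof can be illustrated numerically. The sketch below solves two problems of the form (15) with the same $Q$ and compares $\Vert x_1^*-x_2^*\Vert_\infty$ and $\Vert x_1^*-x_2^*\Vert_2$ with the quantity inside the parentheses of the bound, computed with the actual $\vert\mu_1-\mu_2\vert$ in place of $\nu^*$. It therefore checks only the intermediate triangle-inequality step, not the final statement of Lemma 3. The helper names `solve` and `perturbation_report` are hypothetical, and the multipliers are assumed to lie below $\lambda_1$.

```python
import numpy as np

def solve(Q, q, iters=200):
    """Nondegenerate solution of (15) via bisection on the multiplier."""
    lam, Phi = np.linalg.eigh(Q)
    gamma = Phi.T @ q
    lo, hi = lam[0] - np.linalg.norm(gamma) - 1.0, lam[0]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.sum(gamma**2 / (lam - mid)**2) < 1.0:
            lo = mid
        else:
            hi = mid
    mu = lo
    return Phi @ (gamma / (lam - mu)), mu, gamma, lam

def perturbation_report(Q, q1, q2):
    x1, mu1, g1, lam = solve(Q, q1)
    x2, mu2, g2, _ = solve(Q, q2)
    # Triangle-inequality bound on ||c1 - c2||_2 from the decomposition,
    # using the actual |mu1 - mu2| rather than the estimate nu*.
    bound = (np.linalg.norm(g1 - g2) / (lam[0] - mu2)
             + np.linalg.norm(g1) * abs(mu1 - mu2)
             / ((lam[0] - mu1) * (lam[0] - mu2)))
    linf = np.max(np.abs(x1 - x2))
    l2 = np.linalg.norm(x1 - x2)
    return linf, l2, bound

Q = np.diag([1.0, 2.0, 3.0])
q1 = np.array([0.30, 0.20, 0.10])
q2 = np.array([0.31, 0.19, 0.10])
print(perturbation_report(Q, q1, q2))
```

Since the eigenbasis is orthonormal, $\Vert x_1^*-x_2^*\Vert_2=\Vert c_1^*-c_2^*\Vert_2$, so the three returned numbers should appear in nondecreasing order.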

A.4 Neuberger’s Theorem

In this subsection, we recall Neuberger’s theorem.

Theorem 2

Suppose that $r > 0$, that $x \in \mathbb R^p$, and that $F$ is a continuous function from $\bar{B}(x,r)$ to $\mathbb R^m$ with the property that for each $y$ in $B(x, r)$, there is an $h$ in $\bar{B}(0,r)$ such that

$$\begin{aligned} \lim _{t\rightarrow 0+} \ \frac{(F(y + th) - F(y))}{t}&= -F(x). \end{aligned}$$

(16)

Then, there exists u in $\bar{B}(x,r)$ such that $F(u) = 0$.
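Theorem 2 is an existence statement, but a continuous Newton flow gives some intuition for condition (16). When $F$ is differentiable, (16) reads $F'(y)h=-F(x)$; following such directions along a path $z(t)$ with $z(0)=x$ gives $z'(t)=-F'(z(t))^{-1}F(x)$, hence $F(z(t))=(1-t)F(x)$, which vanishes at $t=1$. The sketch below integrates this flow with explicit Euler steps for a scalar function of the same form as the $F$ used in the proof of Lemma 3. It is an illustration under these assumptions and is not part of the paper's argument; the name `continuous_newton` and the numerical values are hypothetical.

```python
import numpy as np

def continuous_newton(F, dF, x0, steps=10000):
    """Explicit Euler discretisation of z'(t) = -F(x0)/F'(z(t)) on [0, 1]
    for a scalar function F with derivative dF."""
    z, F0, dt = float(x0), F(x0), 1.0 / steps
    for _ in range(steps):
        z += dt * (-F0 / dF(z))
    return z

lam = np.array([1.0, 2.0, 3.0])        # illustrative eigenvalues
gamma = np.array([0.3, 0.2, 0.1])      # illustrative coefficients

def F(mu):
    return np.sum(gamma**2 / (lam - mu)**2) - 1.0

def dF(mu):
    return 2.0 * np.sum(gamma**2 / (lam - mu)**3)

u = continuous_newton(F, dF, x0=0.0)
print(u, F(u))                         # F(u) should be close to 0
```

The residual at the end of the path is only as small as the Euler discretisation allows; a smaller step or a higher-order integrator would tighten it.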

About this paper

Cite this paper

Chrétien, S., Ho, Z.W.O. (2019). Incoherent Submatrix Selection via Approximate Independence Sets in Scalar Product Graphs. In: Nicosia, G., Pardalos, P., Umeton, R., Giuffrida, G., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2019. Lecture Notes in Computer Science, vol 11943. Springer, Cham. https://doi.org/10.1007/978-3-030-37599-7_9

DOI: https://doi.org/10.1007/978-3-030-37599-7_9
Published: 03 January 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37598-0
Online ISBN: 978-3-030-37599-7
eBook Packages: Computer Science, Computer Science (R0)
