Stability and robustness of the l2/lq-minimization for block sparse recovery
Introduction
Compressed sensing [4], [6], [14] is a scheme which shows that some signals can be reconstructed from far fewer measurements than the classical Nyquist-Shannon sampling method requires. Suppose the observed data are y = Ax + e, where A ∈ ℝ^(n×N) (n < N) is a real matrix, called the measurement matrix, and e ∈ ℝⁿ is a noise vector; one then wishes to recover the signal x ∈ ℝᴺ from this linear system. To extract the information x, one applies a decoder Δ to y, which is, typically, a nonlinear operator mapping from ℝⁿ to ℝᴺ. The vector Δ(y) is viewed as an approximation to x. The central question of compressed sensing is: what are the good encoder-decoder pairs (A, Δ) [2]? It is natural to hope that there exists an encoder-decoder pair (A, Δ) such that the recovery error ‖x − Δ(y)‖ is as small as possible.
To measure the performance of an encoder-decoder pair (A, Δ), one uses the Gelfand width [2], [13] to characterize the degree of approximation of x in the noise-free case. Then, for a general matrix A, does there exist a decoder Δ such that Δ(Ax) = x for sparse x? In fact, in the classic compressed sensing problem, there exists an essential decoder Δ0: y ↦ argmin{‖z‖0 : Az = y}, which yields Δ0(Ax) = x for all s-sparse vectors x whenever every 2s columns of A are linearly independent [16]. Here, ‖x‖0 denotes the number of non-zero entries of the vector x, and an s-sparse vector is one with ‖x‖0 ≤ s ≪ N.
However, the l0-minimization (4) is a nonconvex, NP-hard optimization problem [32]. To overcome this difficulty, one uses the decoder Δq, i.e., lq-minimization, for 0 < q ≤ 1 [5], [6], [15], [17], [23], [27], [39]. When q = 1, [3] proved that the solutions to (5) are equivalent to those of (4) provided that the measurement matrix satisfies the Restricted Isometry Property (RIP) [5] with a suitable Restricted Isometry Constant (RIC) δs ∈ (0, 1); here δs is defined as the smallest constant satisfying (1 − δs)‖x‖₂² ≤ ‖Ax‖₂² ≤ (1 + δs)‖x‖₂² for all s-sparse vectors x. For 0 < q < 1, [34] recently proved that the solutions to (5) are equivalent to those of (4) as long as q is smaller than a definite constant q0 < 1, where q0 depends on A and y. The lq(0 < q < 1)-minimization is a natural bridge between the l1-minimization and the l0-minimization because, compared to the l1-norm ‖x‖1, the quantity ‖x‖q^q for 0 < q < 1 is a closer approximation to ‖x‖0 while still inducing sparsity. Besides, it was shown that the lq(0 < q < 1)-minimization often needs less restrictive RIP requirements, yet still guarantees perfect recovery, for smaller q [8], [29], [36] compared to the l1-minimization. Numerical experiments [7] also showed that the lq(0 < q < 1)-minimization recovers sparse signals from fewer linear measurements than does the l1-minimization. Of course, there exist other kinds of nonconvex penalties that replace the l0-minimization (4), typically the smoothly clipped absolute deviation (SCAD) [21], the minimax concave penalty (MCP) [44] and the capped l1-norm [45]. These penalties can often induce better sparsity and reduce the bias, and they are relatively easy to implement compared to the lq(0 < q < 1)-minimization from the algorithmic point of view. But they need theoretical guarantees to determine an appropriate regularization parameter for the corresponding iterative thresholding algorithm [42].
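To make the approximation claim concrete, the following sketch (plain NumPy, with an illustrative vector of our own choosing) evaluates Σᵢ|xᵢ|^q for decreasing q and compares it with ‖x‖0:

```python
import numpy as np

def lq_power(x, q):
    """Return sum_i |x_i|^q, i.e. ||x||_q^q for the lq (quasi-)norm."""
    return float(np.sum(np.abs(x) ** q))

# An illustrative vector with ||x||_0 = 3 non-zero entries.
x = np.array([3.0, -1.5, 0.0, 0.25, 0.0])
l0 = int(np.count_nonzero(x))

# As q decreases toward 0, ||x||_q^q approaches the number of non-zeros,
# so lq-minimization with small q mimics l0-minimization more closely.
gaps = {q: abs(lq_power(x, q) - l0) for q in (1.0, 0.5, 0.1, 0.01)}
```

The gap to ‖x‖0 shrinks monotonically as q decreases, which is the sense in which lq interpolates between l1 and l0.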
However, the lq(0 < q < 1)-minimization strategy offers more theoretical advantages in compressed sensing than the above-mentioned nonconvex relaxed models. Therefore, this paper mainly focuses on the lq(0 < q < 1)-minimization in what follows.
In addition to recovering sparse vectors from error-free measurements, one requires that the decoder be robust to noise and stable with regard to the compressibility of x [25]. That is, with ‖e‖2 ≤ ϵ, the decoder reads Δq^ϵ: y ↦ argmin{‖z‖q : ‖Az − y‖2 ≤ ϵ}, which provides a stability estimate of the form (see [3] for q = 1 and [23] for a general statement with 0 < q < 1) ‖x − Δq^ϵ(Ax + e)‖2 ≤ C s^(1/2 − 1/q) σs(x)q + D ϵ, (8) where σs(x)q denotes the best s-term approximation error of x in the lq (quasi-)norm, q > 0, i.e., σs(x)q = min{‖x − z‖q : ‖z‖0 ≤ s}. Here and in the rest of the present paper, C and D denote positive absolute constants whose values may change from instance to instance.
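As a concrete illustration of σs(x)q, a short sketch (plain NumPy; the example values are ours) keeps the s largest-magnitude entries and measures the remaining tail in the lq (quasi-)norm:

```python
import numpy as np

def sigma_s_q(x, s, q):
    """Best s-term approximation error of x in the lq (quasi-)norm:
    the lq norm of the tail left after removing the s largest entries."""
    tail = np.sort(np.abs(x))[:-s] if s > 0 else np.abs(x)
    return float(np.sum(tail ** q) ** (1.0 / q))

x = np.array([5.0, -2.0, 1.0, 0.5, 0.0])
err = sigma_s_q(x, 2, 1.0)  # tail [1, 0.5, 0] in l1
```

For any exactly s-sparse vector the error is zero, which is why σs(x)q quantifies compressibility rather than strict sparsity.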
The inequality (8) was obtained based on the RIP. However, using the Null Space Property (NSP) [25], [26] (see [25] for q = 1 and [26] for 0 < q < 1), one also gets results similar to (8). Furthermore, if we set e = 0 and ϵ = 0, then the inequality (8) reads as ‖x − Δq(Ax)‖2 ≤ C s^(1/2 − 1/q) σs(x)q.
This motivates the concept of instance optimality. Consider the set of all encoder-decoder pairs (A, Δ) with A ∈ ℝ^(n×N) and Δ: ℝⁿ → ℝᴺ.
Then the instance optimality is defined as follows.
Definition 1 Let 0 < q ≤ p. If for all x ∈ ℝᴺ there exist a pair (A, Δ) and a constant C > 0 such that ‖x − Δ(Ax)‖p ≤ C s^(1/p − 1/q) σs(x)q, then the encoder-decoder pair (A, Δ) is said to satisfy the (p, q) instance optimality of order s.
From Definition 1, we can conclude that an s-sparse signal can be exactly recovered if there exists a pair (A, Δ) satisfying the instance optimality. We can also see from (8) that the encoder-decoder pair (A, Δq) satisfies the (2, q) instance optimality when the matrix A has the RIP or the NSP for the corresponding q ≤ 1.
The stability result (8) requires a proper estimate of the level ϵ of the measurement error; however, such an a priori bound ‖e‖2 ≤ ϵ is not available in many practical scenarios. One therefore hopes that (8) can be improved, and for this purpose the following quotient property was proposed.
Definition 2 Given a matrix A ∈ ℝ^(n×N), if there is a constant α > 0 such that A(Bq^N) ⊇ α B2^n, then the matrix A is said to satisfy the lq quotient property with constant α, where Bq^N denotes the unit ball relative to the lq norm (q ≥ 1) or quasi-norm (0 < q < 1), that is, Bq^N = {x ∈ ℝᴺ : ‖x‖q ≤ 1}, and B2^n is the l2 unit ball of ℝⁿ.
Using the quotient property, one can obtain the following robustness estimate (see [25], [43] for q = 1 and [35] for 0 < q < 1): ‖x − Δq(Ax + e)‖2 ≤ C s^(1/2 − 1/q) σs(x)q + D ‖e‖2. (12) Obviously, (12) indicates that the lq-minimization can perform well for arbitrary measurement error without estimating the upper bound of ‖e‖2.
However, classic compressed sensing only considers the sparsity of the reconstructed signal and does not take any further structure into account. In many practical applications, the reconstructed signal is not only sparse but its non-zero entries are also aligned in a few blocks rather than being arbitrarily spread throughout the vector. Such signals are called block sparse signals and arise in several areas of signal processing and machine learning, for example, color imaging [31], DNA microarrays [33], equalization of sparse communication channels [11], multi-response linear regression [37], image annotation [28], etc.
To define the block sparsity, it is necessary to introduce some further notation. Suppose that x ∈ ℝᴺ is split into m blocks x[1], x[2], …, x[m] of lengths d1, d2, …, dm respectively, that is, x = (x[1]ᵀ, x[2]ᵀ, …, x[m]ᵀ)ᵀ (13) and N = d1 + d2 + ⋯ + dm. A vector x is called block s-sparse over {d1, …, dm} if x[i] is nonzero for at most s indices i [19]. Obviously, when di = 1 for each i, the block sparsity reduces to the conventional definition of a sparse vector.
Furthermore, we introduce the notations ‖x‖2,0 = Σᵢ I(‖x[i]‖2) and ‖x‖2,q = (Σᵢ ‖x[i]‖2^q)^(1/q) for q > 0, where I(x) is an indicator function that takes the value 1 if x > 0 and 0 otherwise. So a block s-sparse vector x can be defined by ‖x‖2,0 ≤ s. As with ‖x‖q, for 1 ≤ q ≤ ∞, ‖x‖2,q is a norm, while for 0 < q < 1 it is a quasi-norm and satisfies the q-triangle inequality ‖x + y‖2,q^q ≤ ‖x‖2,q^q + ‖y‖2,q^q. Obviously, for an m-block signal x whose structure is like (13) and for 0 < q ≤ p, we have ‖x‖2,p ≤ ‖x‖2,q, and especially ‖x‖2,1 ≤ ‖x‖1. Let Σs denote the set of all block s-sparse vectors: Σs = {x ∈ ℝᴺ : ‖x‖2,0 ≤ s}. Similar to the definition of σs(x)q, we use σs(x)2,q to denote the best block s-term approximation error of x in the l2/lq (quasi-)norm, q > 0, i.e., σs(x)2,q = min{‖x − z‖2,q : z ∈ Σs}. It is clear that σs(x)2,q = 0 for all x ∈ Σs.
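The mixed-norm quantities above can be sketched directly (plain NumPy; the block partition and vector are illustrative choices of ours):

```python
import numpy as np

def block_norms(x, block_starts):
    """l2 norms of the blocks of x; block_starts are the split points."""
    return np.array([np.linalg.norm(b) for b in np.split(x, block_starts)])

def mixed_norm(x, block_starts, q):
    """||x||_{2,q} = (sum_i ||x[i]||_2^q)^(1/q)."""
    return float(np.sum(block_norms(x, block_starts) ** q) ** (1.0 / q))

# A 6-dimensional signal split into 3 blocks of length 2.
x = np.array([3.0, 4.0, 0.0, 0.0, 0.0, 1.0])
starts = [2, 4]
norms = block_norms(x, starts)           # block l2 norms
block_l0 = int(np.count_nonzero(norms))  # ||x||_{2,0}: x is block 2-sparse
n21 = mixed_norm(x, starts, 1.0)         # ||x||_{2,1}
```

Note that x here has four non-zero entries but only two non-zero blocks, which is exactly the distinction block sparsity captures.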
To recover a block sparse signal, similarly to the standard l0-minimization, one seeks the sparsest block sparse vector via the following l2/l0-minimization [18], [19], [20]: min_z ‖z‖2,0 subject to Az = y.
But the l2/l0-minimization problem is also NP-hard. It is natural to replace it with the l2/l1-minimization [18], [19], [20], [38]: min_z ‖z‖2,1 subject to Az = y.
To characterize the performance of this method, Eldar and Mishali [19] proposed the block Restricted Isometry Property (block RIP).
Definition 3 (Block RIP) Given a matrix A ∈ ℝ^(n×N), if there exists a positive constant δs < 1 such that (1 − δs)‖x‖₂² ≤ ‖Ax‖₂² ≤ (1 + δs)‖x‖₂² for every block s-sparse vector x over {d1, …, dm}, then the matrix A satisfies the s-order block RIP over {d1, …, dm}.
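For small dimensions the block RIP constant can be computed by brute force: δs is the largest deviation from 1 of the eigenvalues of A_T^T A_T over all supports T of s blocks. A sketch (plain NumPy; the matrix and block structure are illustrative, not from the paper):

```python
import itertools
import numpy as np

def block_rip_constant(A, block_starts, s):
    """Brute-force block RIP constant of order s: the smallest delta with
    (1-delta)||x||^2 <= ||Ax||^2 <= (1+delta)||x||^2 for all block s-sparse
    x, found via eigenvalues of the Gram matrices of support submatrices."""
    blocks = np.split(np.arange(A.shape[1]), block_starts)
    delta = 0.0
    for supp in itertools.combinations(range(len(blocks)), s):
        cols = np.concatenate([blocks[i] for i in supp])
        eig = np.linalg.eigvalsh(A[:, cols].T @ A[:, cols])
        delta = max(delta, abs(eig[0] - 1.0), abs(eig[-1] - 1.0))
    return delta

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 12)) / np.sqrt(20)  # 6 blocks of length 2
starts = [2, 4, 6, 8, 10]
d1 = block_rip_constant(A, starts, 1)
d2 = block_rip_constant(A, starts, 2)
```

Since every size-1 support is contained in a size-2 support, eigenvalue interlacing makes δs nondecreasing in s, which the brute-force values reproduce.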
Obviously, the block RIP is an extension of the standard RIP, but it is a less stringent requirement than the standard RIP [1], [19]. Eldar et al. [19] proved that the l2/l1-minimization can exactly recover any block s-sparse signal when the measurement matrix A satisfies the block RIP with δ2s < √2 − 1. This block RIP constant can also be improved; for example, Lin and Li [30] improved the bound to δ2s < 0.4931 and established another sufficient condition for exact recovery.
Based on the performance of the lq(0 < q < 1)-minimization [7], [12], [23], it is also natural to extend the lq-minimization to the setting of block sparse recovery, which motivates us to consider the l2/lq-minimization. It is defined by [31], [40], [41] min_z ‖z‖2,q^q subject to Az = y, or, for inaccurate measurements with ‖e‖2 ≤ ϵ, min_z ‖z‖2,q^q subject to ‖Az − y‖2 ≤ ϵ. Note that the noise-free program corresponds to ϵ = 0.
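One common way to approach the nonconvex l2/lq program numerically is iteratively reweighted least squares (IRLS) with a decreasing smoothing parameter. The sketch below (plain NumPy; the update rule, smoothing schedule and test signal are our own illustrative choices, not the paper's decoder) solves each weighted least squares step in closed form:

```python
import numpy as np

def irls_block_lq(A, y, block_starts, q=0.5, iters=60, eps=1.0):
    """IRLS sketch for min ||z||_{2,q}^q subject to Az = y.
    Each step solves a weighted least squares problem in closed form;
    eps smooths the block weights and is decreased geometrically."""
    n, N = A.shape
    blocks = np.split(np.arange(N), block_starts)
    x = np.linalg.lstsq(A, y, rcond=None)[0]  # least-norm start
    for _ in range(iters):
        w = np.empty(N)
        for b in blocks:
            # inverse weight (||x[i]||^2 + eps)^(1 - q/2), constant per block
            w[b] = (np.linalg.norm(x[b]) ** 2 + eps) ** (1.0 - q / 2.0)
        Aw = A * w  # A @ diag(w)
        x = w * (A.T @ np.linalg.solve(Aw @ A.T + 1e-12 * np.eye(n), y))
        eps = max(eps * 0.7, 1e-9)
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((16, 24)) / 4.0       # 12 blocks of length 2
x0 = np.zeros(24)
x0[4:6], x0[10:12] = [2.0, -1.0], [0.5, 3.0]  # block 2-sparse signal
y = A @ x0
xhat = irls_block_lq(A, y, list(range(2, 24, 2)))
```

With enough measurements relative to the block sparsity, the iterates typically concentrate on the correct blocks; the tiny ridge term only guards against singular systems and keeps the constraint Az = y satisfied to high accuracy.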
Like the lq(0 < q < 1)-minimization, the l2/lq(0 < q < 1)-minimization has superior properties compared to the l2/l1-minimization. Numerical experiments demonstrated that fewer measurements are needed for exact recovery with the decoder when 0 < q < 1 than when q = 1; see [31], [40], [41]. Moreover, [41] studied the exact recovery conditions and gave a stability estimate for the l2/lq(0 < q < 1)-minimization based on the block restricted q-isometry property. That analysis, however, requires a proper estimate of the noise level ϵ, and such an estimate may not be available a priori in some settings. Thus, a question arises: can the decoder perform well for arbitrary measurement error when estimates of the noise level are absent? Besides, another interesting question is whether the block RIP can be weakened while still exactly recovering block sparse vectors via the l2/lq-minimization. The purpose of this paper is to investigate these two questions. To achieve this goal, we propose the lq stable block NSP, which weakens the block RIP. We also propose the lp,q robust block NSP for 0 < q ≤ p, the block quotient property and the block instance optimality, which are crucial to characterize the stability and robustness of the decoder for arbitrary noise e without the need to estimate ‖e‖2.
The remainder of the paper is organized as follows. In Section 2, based on the lq block NSP, we first give the lq stable block NSP, derive an equivalent form of it, and show that it is weaker than the block RIP. Then we further consider the lq robust block NSP and the lp,q robust block NSP for 0 < q ≤ p, respectively. Using these properties, we characterize reconstruction results for the l2/lq-minimization when the observations are corrupted by noise. In Section 3, we give the block instance optimality, the block quotient property and the simultaneous block quotient property. These properties, combined with the lp,q robust block NSP, yield two important lemmas, which are the core of the proof of the main results. In addition, we show that Gaussian random matrices satisfy the block quotient property with high probability. In Section 4, we give our two main results, Theorems 7 and 8. Section 5 is a discussion and Section 6 concludes. Finally, we relegate the proofs of the main results, lemmas and proposition, i.e., Theorems 1–5, Theorems 7 and 8, Lemmas 1 and 2 and Proposition 1, to Appendix A.
lq Block null space property
Suppose that x ∈ ℝᴺ is an m-block signal whose structure is like (13). For a block index set S ⊆ {1, 2, …, m}, by S^C we mean the complement of S with respect to {1, 2, …, m}, i.e., S^C = {1, 2, …, m}\S. Let xS denote the vector equal to x on the block index set S and zero elsewhere; then x = xS + x_{S^C}. In [26], we extended the lq NSP for traditional sparse signals to the block sparse case.
Definition 4 (lq block NSP) Given a matrix A ∈ ℝ^(n×N), if for any set S with card(S) ≤ s and for all v ∈ Ker A\{0}, ‖vS‖2,q^q < ‖v_{S^C}‖2,q^q, then the matrix A is said to satisfy the lq block NSP of order s.
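The lq block NSP condition — in its standard form ‖vS‖2,q^q < ‖v_{S^C}‖2,q^q over null-space vectors v — can be probed numerically for small matrices. A sketch (plain NumPy; a random-sampling check, so it can certify failure but only suggest success):

```python
import itertools
import numpy as np

def block_nsp_probe(A, block_starts, s, q, trials=100, seed=0):
    """Probe the lq block NSP of order s: for sampled v in Ker(A), test
    ||v_S||_{2,q}^q < ||v_{S^C}||_{2,q}^q over all block supports |S| = s.
    Returns False on any violation, True if no violation was found."""
    rng = np.random.default_rng(seed)
    _, sv, Vt = np.linalg.svd(A)
    null = Vt[np.count_nonzero(sv > 1e-10):]   # basis of Ker(A)
    blocks = np.split(np.arange(A.shape[1]), block_starts)
    for _ in range(trials):
        v = null.T @ rng.standard_normal(null.shape[0])
        bq = np.array([np.linalg.norm(v[b]) for b in blocks]) ** q
        total = bq.sum()
        for supp in itertools.combinations(range(len(blocks)), s):
            lhs = bq[list(supp)].sum()
            if lhs >= total - lhs:
                return False
    return True

# A matrix whose null space is spanned by a single block: the NSP must fail.
bad = block_nsp_probe(np.eye(12)[1:], [2, 4, 6, 8, 10], 1, 0.5)
```

The constructed failure case has Ker(A) = span(e1), so all of the null-space "mass" sits in one block and the strict inequality cannot hold.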
Instance optimality and quotient property
In this section, we shall introduce the block instance optimality and the block quotient property for the l2/lq-minimization. These properties, combined with the lp,q robust block NSP, yield Theorem 8. In addition, we shall discuss which matrices satisfy the block quotient property.
Definition 8 ((p, q) block instance optimality) Let 0 < q ≤ p. If for all x ∈ ℝᴺ there exist a pair (A, Δ) and a constant C > 0 such that ‖x − Δ(Ax)‖2,p ≤ C s^(1/p − 1/q) σs(x)2,q, then the encoder-decoder pair (A, Δ) is said to satisfy the (p, q) block instance optimality of order s.
Main results
In this section, we give our main results. The lp,q robust block NSP is required for both theorems, while the lq block quotient property is an additional assumption for Theorem 8. We defer the proofs to Appendix A.
Theorem 7 Given 1 ≤ p ≤ 2 and 0 < q ≤ 1, suppose that the matrix A ∈ ℝ^(n×N) satisfies the lp,q robust block NSP of order s with constants 0 < τ < 1 and γ > 0. Then, for all x ∈ ℝᴺ and e ∈ ℝⁿ with ‖e‖2 ≤ ϵ, there exist positive constants C1 and D1 such that
Discussion
In this section, we first discuss the tightness of the bounds in our results. Second, we describe the impact of some coefficients in Inequalities (43) and (46), as shown in Figs. 1 and 2. Third, we analyze several parameters relevant to the desired properties above.
A natural question is whether the bounds given in the main results are tight. Here, we only consider the case of . Suppose that according to (46), we have
Conclusions
This paper focuses on block sparse recovery via the l2/lq-minimization. We generalized the null space property, the instance optimality and the quotient property to the block sparse case, and gave the lq stable block NSP and an equivalent form of it, a new sufficient condition for exactly recovering block sparse signals via the l2/lq-minimization that is weaker than the block RIP proposed by Eldar and Mishali. Because the instance optimality is important to evaluate the performance of the
Acknowledgements
The authors would like to thank Prof. Jianjun Wang and Dr. Wendong Wang, Southwest University, China, and Dr. Feng Zhao, University of Lincoln, UK, for a beneficial discussion of the manuscript. Moreover, the authors also thank the Editor and anonymous Reviewers for their insightful comments and valuable suggestions, which led to a significant improvement of this paper. This work was supported by the National Natural Science Foundation of China (NSFC) (11131006), Science Research Project of the
References (45)
- The restricted isometry property and its implications for compressed sensing. C. R. Math. Acad. Sci. Paris, Ser. I (2008)
- Deterministic constructions of compressed sensing matrices. J. Complexity (2007)
- Stability and robustness of l1-minimizations with Weibull matrices and redundant dictionaries. Linear Algebra Appl. (2014)
- Sparsest solutions of underdetermined linear systems via lq-minimization for 0 < q ≤ 1. Appl. Comput. Harmon. Anal. (2009)
- The Gelfand widths of lp-balls for 0 < p ≤ 1. J. Complexity (2010)
- Sparse approximate solutions to linear systems. SIAM J. Comput. (1995)
- Restricted p-isometry property and its application for nonconvex compressive sensing. Adv. Comput. Math. (2012)
- On the reconstruction of block-sparse signals with an optimal number of measurements. IEEE Trans. Signal Process. (2009)
- On recovery of block-sparse signals via mixed l2/lq (0 < q ≤ 1) norm minimization. EURASIP J. Adv. Signal Process. (2013)
- Linear convergence of adaptively iterative thresholding algorithms for compressed sensing. IEEE Trans. Signal Process. (2015)
- Model-based compressive sensing. IEEE Trans. Inf. Theory
- A simple proof of the restricted isometry property for random matrices. Constr. Approx.
- Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory
- Decoding by linear programming. IEEE Trans. Inf. Theory
- Near-optimal signal recovery from random projections: universal encoding strategies. IEEE Trans. Inf. Theory
- Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process. Lett.
- Restricted isometry properties and nonconvex compressive sensing. Inverse Probl.
- Stability of compressed sensing for dictionaries and almost sure convergence rate for the Kaczmarz algorithm
- Compressed sensing and best k-term approximation. J. Am. Math. Soc.
- Sparse channel estimation via matching pursuit with application to equalization. IEEE Trans. Commun.
- Restricted isometry constants where lp sparse recovery can fail for 0 < p ≤ 1. IEEE Trans. Inf. Theory
- Compressed sensing. IEEE Trans. Inf. Theory