A gradient-based approach to optimization of compressed sensing systems
Introduction
Compressive sensing, or compressed sensing (CS), has attracted a lot of attention during the last ten years or so [1], [2], [3], [4]. CS-based techniques have found many applications in areas such as image compression and signal detection/classification [4], [5]. At its core, CS is a mathematical framework that deals with accurate recovery of a signal vector y from a lower-dimensional measurement vector z.
Under the CS framework, the measurement z ∈ ℜM × 1 consists of linear projections of the signal vector y ∈ ℜN × 1 via
z = Φy, (1)
where M ≪ N is assumed and Φ ∈ ℜM × N is called a sensing matrix or projection matrix. The original signal is assumed to be of the form
y = Ψs, (2)
where Ψ ∈ ℜN × L is called a dictionary and s ∈ ℜL × 1 is the coefficient vector of y in Ψ. When N < L, Ψ is said to be overcomplete, which is assumed throughout this paper.
A CS system refers to the linear Eqs. (1) and (2) plus an algorithm that can yield an estimate of the original signal y based on the measurement z. By substituting y in (1) with (2), z can be rewritten as
z = ΦΨs ≜ As, (3)
where the matrix A = ΦΨ is sometimes referred to as the equivalent dictionary of the CS system. The ultimate goal of a CS system is to recover s (and hence y) from the measurement z.
As M < L, solving for s is an underdetermined problem since it has an infinite number of solutions. To obtain a unique solution, extra constraints on the linear system have to be imposed. One such constraint is related to the concepts of spark and signal sparsity. The spark of a matrix Q ∈ ℜM × L, denoted spark(Q), is defined as the smallest number of columns in Q that are linearly dependent. The lp-norm of a vector v ∈ ℜN × 1 is defined as
||v||p ≜ (∑i=1N |vi|p)1/p.
For convenience, ||v||0 is used to denote the number of non-zero elements in v (though it is not a norm in a strict sense). A vector y given by (2) is said to be κ-sparse in Ψ if ||s||0 ≤ κ.
It was shown in [6] that any κ-sparse signal can be exactly recovered from its measurement by solving
mins ||s||0 subject to z = As, (4)
as long as spark(A) > 2κ. Such a problem is usually addressed using orthogonal matching pursuit (OMP)-related techniques [7], [8]. Furthermore, it can be shown that the solution to the above problem coincides with that of the l1-based minimization
mins ||s||1 subject to z = As, (5)
while the latter can be solved efficiently using algorithms such as basis pursuit (BP) [9] and the l1/l2-based optimization techniques [10]. Recently, to further enhance the ability to deal with the sparsity issue, a two-level l1 minimization was proposed in [11] for compressed sensing using a non-convex and piecewise linear penalty.
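The greedy OMP recovery referenced above can be sketched as follows. This is a minimal illustration assuming numpy; the Hadamard-based toy dictionary is a construction chosen here because its mutual coherence of 1/4 guarantees exact recovery of any 2-sparse vector, and is not taken from the paper.

```python
import numpy as np

def omp(A, z, kappa):
    """Greedy orthogonal matching pursuit: find a kappa-sparse s with z = A s.
    Assumes the columns of A are l2-normalized."""
    L = A.shape[1]
    support = []
    residual = z.copy()
    coef = np.zeros(0)
    for _ in range(kappa):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit on the current support, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], z, rcond=None)
        residual = z - A[:, support] @ coef
    s = np.zeros(L)
    s[support] = coef
    return s

# Toy equivalent dictionary A = [I | H/4], with H a 16x16 Hadamard matrix
# (Sylvester construction). Its mutual coherence is 1/4, so any 2-sparse
# signal is exactly recovered by OMP.
H = np.array([[1.0]])
for _ in range(4):
    H = np.block([[H, H], [H, -H]])
A = np.hstack([np.eye(16), H / 4.0])   # all columns are unit-norm
s_true = np.zeros(32)
s_true[2], s_true[20] = 1.5, -2.0
z = A @ s_true
s_hat = omp(A, z, kappa=2)
print(np.allclose(s_hat, s_true))      # → True (exact recovery)
```

The least-squares re-fit over the whole support at each step is what distinguishes OMP from plain matching pursuit.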
In this paper, OMP-based algorithms are used, and hence designing a CS system means determining Φ and Ψ for a class of signals.
As discussed previously, signal sparsity is an essential prerequisite for CS. This concerns sparse representation of signals, a widely utilized technique for modelling natural signals with applications including image compression and denoising [12]. The key issue is, for a given class of signals {yj}, to find a dictionary Ψ such that yj is as close to Ψsj as possible for a coefficient vector sj constrained by ||sj||0 ≤ κ, where κ is a prescribed sparsity level. This is usually referred to as sparsifying dictionary learning, classically formulated as
minΨ, S ||Y − ΨS||F2 subject to ||sj||0 ≤ κ ∀ j, (6)
where Y ≜ [y1 y2 ⋯ yJ], S ≜ [s1 s2 ⋯ sJ], and ||.||F denotes the Frobenius norm.
Such a problem is difficult to solve as it is non-convex in Ψ and S, and ||.||0 is non-smooth and highly unstable. A popular approach is based on the alternating minimization strategy, leading to a two-stage iterative procedure in which the kth iteration is carried out with
- Sparse coding: update S to Sk, where Sk is the solution of (6) with Ψ fixed to Ψk−1. This problem can be solved using the OMP-based techniques [7], [8].
- Dictionary update: obtain Ψk as the solution of (6) with S fixed to Sk.
Many algorithms of this class differ mainly in the second stage, i.e., the dictionary update. The very first algorithm is perhaps the method of optimal directions (MOD) [13], in which the dictionary Ψ is simply taken as the least-squares solution
Ψ̃ = Y Sᵀ(S Sᵀ)−1,
where (.)ᵀ denotes the transpose operator, and is then multiplied by a diagonal matrix Dsc such that each column of Ψ = Ψ̃ Dsc is normalized in l2-norm, in order to avoid the scaling ambiguity in this procedure. The most popular method for solving (6) is the K-singular value decomposition (K-SVD) [14], in which the atoms of the dictionary are updated one by one while the sparse structure of the given S is kept unchanged. Such an algorithm usually yields a better performance than the MOD as the non-zero elements in S are simultaneously updated. The same idea was recently used to improve the MOD in [15]. In [16], an algorithm named sequential generalization of K-means (SGK) was proposed, which uses the same strategy of updating atoms one by one but without considering the structure of S.
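The MOD dictionary update can be sketched as below, assuming numpy. The small ridge term eps is an added safeguard against a rank-deficient coefficient matrix and is not part of the original MOD formulation.

```python
import numpy as np

def mod_update(Y, S, eps=1e-12):
    """One MOD dictionary update: least-squares solution of min ||Y - Psi S||_F^2,
    followed by a diagonal column scaling so each atom has unit l2-norm
    (removing the scale ambiguity between Psi and S)."""
    # Least-squares solution Psi = Y S^T (S S^T)^{-1}; eps is a numerical
    # safeguard added here, not part of MOD itself.
    Psi = Y @ S.T @ np.linalg.inv(S @ S.T + eps * np.eye(S.shape[0]))
    # Normalize each atom (column) to unit l2-norm.
    Psi /= np.linalg.norm(Psi, axis=0, keepdims=True)
    return Psi

# Toy sizes: 16-dimensional signals, 32 atoms, 100 training samples.
rng = np.random.default_rng(1)
Y = rng.standard_normal((16, 100))
S = rng.standard_normal((32, 100))
Psi = mod_update(Y, S)
print(np.allclose(np.linalg.norm(Psi, axis=0), 1.0))  # atoms are unit-norm
```

In a full learner this update alternates with an OMP sparse-coding stage until the representation error stops decreasing.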
As seen, the larger spark(A) is, the bigger the signal space over which the CS system can guarantee an exact recovery. For a given Ψ, the spark of the equivalent dictionary is determined by the sensing matrix Φ. It would be of great interest to design Φ such that spark(A) is maximized. Unfortunately, spark(A) is not tractable. As shown in [6], any κ-sparse signal s0 (in A) can be exactly recovered from z = As0 via (4) as long as
κ < (1/2)(1 + 1/μ(A)), (7)
where μ(A) is the mutual coherence of the matrix A ∈ ℜM × L, defined as the maximum of the coherence factors between the column vectors of A:
μ(A) ≜ maxi ≠ j |aiᵀ aj| / (||ai||2 ||aj||2). (8)
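Computing the mutual coherence defined above is straightforward; a short numpy sketch with two sanity checks (the identity matrix is perfectly incoherent, a duplicated atom is maximally coherent):

```python
import numpy as np

def mutual_coherence(A):
    """mu(A): the largest absolute coherence factor between distinct columns."""
    An = A / np.linalg.norm(A, axis=0)   # column-normalize
    G = np.abs(An.T @ An)                # Gram of the normalized matrix
    np.fill_diagonal(G, 0.0)             # ignore the trivial diagonal (all ones)
    return G.max()

I = np.eye(4)
print(mutual_coherence(I))               # → 0.0 (orthonormal columns)
B = np.hstack([I, I[:, :1]])             # append a duplicate of column 0
print(mutual_coherence(B))               # → 1.0 (maximally coherent pair)
```

By (7), a small μ(A) directly enlarges the sparsity level κ for which exact recovery is guaranteed.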
It is due to (7) that the prevailing approaches to optimal sensing matrix design are all based on mutual coherence-related properties of the equivalent dictionary A.
The Gram of a matrix Q is defined as the product GQ ≜ QᵀQ. Note that the off-diagonal entries of the Gram of a column-normalized Q are exactly the coherence factors of Q. This suggests that the coherence behavior of Q can be studied via its Gram. An averaged mutual coherence was proposed in [17] as
μav(A) ≜ (∑(i, j) ∈ Sav |g̃ij|) / Nav, (9)
where g̃ij is the (i, j)th element of the Gram of à ≜ A Ssc, with Ssc a diagonal scaling matrix such that the columns of à are of unit l2-norm, Sav ≜ {(i, j): i ≠ j, |g̃ij| ≥ t} with a prescribed parameter t, and Nav is the number of elements in the index set Sav. Such a measure is a good performance indicator but is difficult to minimize with respect to Φ.
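The averaged mutual coherence can be evaluated as follows; a numpy sketch assuming the thresholded-average form reconstructed above, with an illustrative threshold t.

```python
import numpy as np

def averaged_mutual_coherence(A, t):
    """Average of the off-diagonal normalized-Gram magnitudes that are >= t."""
    An = A / np.linalg.norm(A, axis=0)       # column-normalize (A @ S_sc)
    G = np.abs(An.T @ An)                    # coherence factors
    L = G.shape[0]
    off = G[~np.eye(L, dtype=bool)]          # drop the all-ones diagonal
    sel = off[off >= t]                      # index set S_av
    return sel.mean() if sel.size else 0.0

rng = np.random.default_rng(5)
A = rng.standard_normal((15, 40))
mu_av = averaged_mutual_coherence(A, t=0.2)
print(0.2 <= mu_av <= 1.0)                   # average of selected factors lies in [t, 1]
```

Unlike the worst-case μ(A), this average is insensitive to a single bad atom pair, which is why it serves better as a performance indicator than as a design objective.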
Most of the existing approaches to optimal sensing matrix design can be unified as
minΦ, H ∈ ℋ ||H − AᵀA||F2, A = ΦΨ, (10)
where ℋ is a non-empty subset of symmetric matrices with diagonal elements all equal to one, containing a collection of target Grams which have some desired properties.
The very first target Gram used for optimal sensing matrix design is IL, the identity matrix of dimension L (see [5], [18], [19], [20]). Another popular target Gram is based on the concept of equiangular tight frames (ETFs). The set of columns of a (column-normalized) matrix Q is said to form an ETF if the magnitudes |qiᵀ qj| (i ≠ j) are all equal, and such a matrix yields the smallest possible mutual coherence [21], [22]. Therefore, it is desirable to make the Gram of the equivalent dictionary as close as possible to that of an ETF. Since it is very difficult to characterize the set of ETF Grams, the latter is practically regularized as
ℋξ ≜ {H: H = Hᵀ, Hii = 1 ∀ i, maxi ≠ j |Hij| ≤ ξ}, (11)
where the parameter ξ > 0, called the prescribed coherence level, is a positive constant not less than the Welch bound, controlling the search space for the optimal H. When ξ is larger than the Welch bound [21]
μW ≜ √((L − M)/(M(L − 1))), (12)
which lower-bounds the mutual coherence of any Q of dimension M × L, the ideal ETF Grams of dimension L (with all possible ranks) are confined in ℋξ.
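The Welch bound in (12) is easy to check numerically; the sketch below (assuming numpy) compares it against the mutual coherence of a random Gaussian matrix, which typically sits well above the bound, one motivation for optimizing Φ rather than drawing it at random.

```python
import numpy as np

def welch_bound(M, L):
    """Lower bound on the mutual coherence of any M x L matrix (L >= M)."""
    return np.sqrt((L - M) / (M * (L - 1)))

rng = np.random.default_rng(2)
Q = rng.standard_normal((20, 60))
Qn = Q / np.linalg.norm(Q, axis=0)    # unit-norm columns
G = np.abs(Qn.T @ Qn)
np.fill_diagonal(G, 0.0)
mu = G.max()                          # mutual coherence of Q
print(welch_bound(20, 60) <= mu)      # → True: the bound always holds
```

Equality in the bound is attained exactly by ETFs, which is why ETF Grams are the natural targets in (11).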
When ℋ is taken as the regularized set of ETF Grams described above, the corresponding optimal sensing matrix problem (10) is usually addressed using alternating minimization-based techniques [20], [23], [24], [25], [26], [27], which differ from one another in the way the sensing matrix Φ is updated for a given H. As the problem is highly non-convex, numerical methods are usually applied. A QR factorization-based method was proposed in [23], while an iterative algorithm was given in [27].
Gradient-based numerical methods are a class of efficient algorithms popularly used in engineering design where closed-form solutions are difficult to obtain. Such an approach was adopted for optimal sensing matrix design in [18], [24].
It has been observed that the sensing matrix obtained from (10), with H fixed to IL or searched from the regularized set of ETF Grams, performs well only when the sparse representation error is very small.
When Y is projected via the sensing matrix Φ, the corresponding measurements are of the form
Z = ΦY = ΦΨS + ΦE = AS + ΦE, (13)
where E ≜ Y − ΨS and ΦE is the sparse representation error of Z in A.
The first attempt at robust CS system design was given in [5], where the dictionary and the sensing matrix are alternately updated using an iterative procedure. In such a procedure, the dictionary is obtained by minimizing a linear combination of the sparse representation error and the corresponding measurement-domain error for a given sensing matrix, while the sensing matrix update is done with a different measure. Following the same lines, an alternative method was proposed in [25] with refined algorithms for designing each of the dictionary and the sensing matrix. The resultant CS systems from both demonstrate a very good performance against the sparse representation error.
Though Y0 ≜ ΨS is a satisfactory approximation of Y, the projected error ||ΦE||F can be very big if Φ is not properly chosen. In order to reconstruct Y0 with higher accuracy, ||ΦE||F should be taken into account in designing the sensing matrix. Therefore, it is desired to choose Φ in the same way as suggested by (10) and, at the same time, to reduce ||ΦE||F as much as possible. To deal with this multi-target problem, Li et al. proposed an approach to robust sensing matrix design in [26], in which the following measure is investigated:
f(Φ, H) ≜ ||H − AᵀA||F2 + α ||ΦE||F2,
where H belongs to a subset ℋ defined before, A = ΦΨ is the equivalent dictionary, and α ≥ 0 is a parameter controlling the coupling of the measurement sparse representation error into the cost function.
The optimal robust sensing matrix problem under this framework was formulated in [26] as
minΦ, H ∈ ℋ ||H − AᵀA||F2 + α ||ΦE||F2, (14)
where Ψ and E are given. Both ℋ = {IL} and ℋ taken as the regularized set of ETF Grams were considered. Experiments showed that the obtained sensing matrix is very robust against the sparse representation error, comparable to [5] and [25].
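The robust design objective can be evaluated as follows; a numpy sketch assuming the measure takes the form ||H − AᵀA||F2 + α||ΦE||F2 described above (the sizes and the identity target Gram are illustrative choices).

```python
import numpy as np

def robust_measure(Phi, Psi, E, H, alpha):
    """Assumed robust design measure: ||H - A^T A||_F^2 + alpha * ||Phi E||_F^2,
    with equivalent dictionary A = Phi Psi."""
    A = Phi @ Psi
    coherence_term = np.linalg.norm(H - A.T @ A, "fro") ** 2
    error_term = np.linalg.norm(Phi @ E, "fro") ** 2
    return coherence_term + alpha * error_term

rng = np.random.default_rng(3)
M, N, L, J = 8, 16, 24, 50
Phi = rng.standard_normal((M, N))
Psi = rng.standard_normal((N, L))
Psi /= np.linalg.norm(Psi, axis=0)          # unit-norm atoms
E = 0.01 * rng.standard_normal((N, J))      # small sparse representation error
H = np.eye(L)                               # identity target Gram
# alpha = 0 reduces to the pure coherence-based objective (10); alpha > 0
# only adds a nonnegative penalty on the projected representation error.
print(robust_measure(Phi, Psi, E, H, 0.0) <= robust_measure(Phi, Psi, E, H, 1.0))
```

Sweeping α trades off coherence shaping against robustness to the representation error, which is precisely the coupling the parameter controls.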
The main problems and corresponding contributions in this paper are as follows.
- Traditionally, sparsifying dictionary design means solving (6). In such an approach, the main concern is to minimize the sparse representation error, and the atoms of the obtained dictionary may be very coherent. As understood, it is very difficult to make the equivalent dictionary A have a small mutual coherence μ(A) if the dictionary Ψ is very coherent.1 Therefore, it is important to design the sparsifying dictionary Ψ such that the sparse representation error is minimized with minimal coherence. This is the first problem to be investigated in this paper. A new approach to incoherent sparsifying dictionary design is proposed for that purpose, and a gradient descent-based algorithm is derived to obtain the corresponding optimal sparsifying dictionary.
- Denote Q̃ ≜ Q DQ, where DQ is a diagonal matrix such that every column of Q̃ is of unit l2-norm; Q̃ is said to be the normalized version of the matrix Q. It follows from (8) that Q has the same coherence properties as its normalized version does. Note that the term ||H − AᵀA||F2 in the classical mutual coherence-based approaches, specified in (10) and (14), is intended to measure the coherence difference between the target Gram H and that of the equivalent dictionary A. As H is assumed to have its diagonal elements all equal to one, this term has the assigned physical meaning only when A is normalized. This means that the optimal sensing matrix design problems (10) and (14) should be solved subject to A being column-normalized. The second problem considered in this paper is to investigate the optimal robust sensing matrix design problem (10), but with A replaced by its normalized version Ã. Based on a parametric technique, a gradient descent-based algorithm is derived to solve such a problem.
The outline of this paper is as follows. A novel framework for designing incoherent dictionaries is proposed in Section 2, where an alternating minimization-based algorithm is derived to solve for the optimal dictionary Ψd. Section 3 is devoted to designing a robust sensing matrix, which is of the same form as (10) but with A replaced by its normalized version Ã. Such a problem is considered both for H searched within the regularized set of ETF Grams and for H fixed to the Gram of the dictionary Ψd. Experiments are carried out in Section 4 to demonstrate the superiority of the proposed approaches over the existing ones. Some concluding remarks are given in Section 5 to end this paper.
A novel framework for designing incoherent sparsifying dictionary
As argued in the previous section, in order to enhance the performance of a CS system characterized by (Φ, Ψ), the equivalent dictionary should be as incoherent as possible. Since it is very difficult to make the equivalent dictionary have a good coherence behavior by adjusting Φ alone if Ψ is very coherent, due to the dimensionality relationship M < N < L, it is desirable to design the dictionary Ψ with its coherence behavior taken into account. This is the motivation and objective of this section.
Revisit of the mutual coherence-based optimal sensing matrix
In this section, we consider the problem of designing the sensing matrix with a sparsifying dictionary Ψ given, say the dictionary Ψd obtained in Section 2.
Experimental results
In this section, we evaluate the performance of the proposed algorithms and compare them with some existing works.
Concluding remarks
In this paper, we investigated the problem of designing robust CS systems, in terms of optimizing the sparsifying dictionary and the sensing matrix. The two problems were addressed using an alternating minimization-based approach, where both the dictionary and sensing matrix are updated via a gradient descent-based algorithm. The expressions for the corresponding derivatives were derived. The validity of the proposed approaches was clearly demonstrated with experiments.
In the proposed CS
Acknowledgement
This work was supported by the National Natural Science Foundation of China (grant numbers 61571174, 61503339), and Zhejiang Provincial Natural Science Foundation of China (grant number LY15F010010).
References (28)
- Two-level l1 minimization for compressed sensing, Signal Process. (2015)
- Grassmannian frames with applications to coding and communication, Appl. Comput. Harmon. Anal. (2003)
- A gradient-based alternating minimization approach for optimization of the measurement matrix in compressive sensing, Signal Process. (2012)
- An efficient algorithm for designing projection matrix in compressive sensing based on alternating optimization, Signal Process. (2016)
- Optimized projections for compressed sensing via rank-constrained nearest correlation matrix, Appl. Comput. Harmon. Anal. (2014)
- Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory (2006)
- Compressed sensing, IEEE Trans. Inf. Theory (2006)
- Near-optimal signal recovery from random projections: universal encoding strategies?, IEEE Trans. Inf. Theory (2006)
- Structured compressed sensing: from theory to applications, IEEE Trans. Signal Process. (2011)
- Learning to sense sparse signals: simultaneous sensing matrix and sparsifying dictionary optimization, IEEE Trans. Image Process. (2009)
- Optimally sparse representation in general (nonorthonormal) dictionaries via l1 minimization, Proc. Natl. Acad. Sci.
- Greed is good: algorithmic results for sparse approximation, IEEE Trans. Inf. Theory
- Signal recovery from random measurements via orthogonal matching pursuit, IEEE Trans. Inf. Theory
- An introduction to compressive sampling, IEEE Signal Process. Mag.