Sparse signal recovery via infimal convolution based penalty

https://doi.org/10.1016/j.image.2021.116214

Highlights

  • We propose a non-convex infimal convolution based penalty function for sparse signal recovery.

  • We use two iterative methods to solve the non-convex minimization problem.

  • We prove that the proposed algorithms converge to stationary points.

  • We show the effectiveness of the proposed algorithms through numerical examples.

Abstract

In this paper, we propose a non-convex penalty function for sparse signal recovery based on an infimal convolution approximation. First, we show that this penalty function lies between the $\ell_1$ norm and the difference of the $\ell_1$ and $\ell_2$ norms (the $\ell_{1-2}$ penalty), so it retains the advantages of both at the same time: it induces sparsity effectively for the low-amplitude components, as the $\ell_1$ norm does, and relieves the underestimation of high-amplitude components, as the $\ell_{1-2}$ penalty does. Second, we employ two iterative methods to solve the resulting non-convex minimization problem. One is based on the difference of convex functions algorithm (DCA), which uses the alternating direction method of multipliers (ADMM) to solve its subproblem; the other employs the forward–backward splitting (FBS) algorithm, whose proximal step can be evaluated with the derived closed-form solutions. We also show that these two algorithms converge to a stationary point satisfying the first-order optimality condition. The experimental results demonstrate the effectiveness of the proposed method in comparison with several other penalty functions.

Introduction

In recent years, sparse recovery problems have drawn much attention in many applications such as compressive sensing (CS) [1], [2], [3], [4], machine learning [5], [6] and image processing [7], [8], [9], [10]. These problems are usually formulated as
$$\min_{x\in\mathbb{R}^N} F(x) \triangleq f(x) + \lambda g(x)$$
where $x\in\mathbb{R}^N$ is the unknown signal that is sparse or can be sparsely represented on an appropriate basis, $f(x)$ is the loss function related to the data fidelity term, $g(x)$ is the regularization function that promotes the sparsity of $x$, and $\lambda>0$ is the penalty parameter balancing regularization and data fidelity.

For a linear sampling system $b = Ax + n$, where $n$ is the observation noise or error, the loss function is often chosen as the least-squares (LS) residual $f(x) = \|Ax - b\|_2^2$ under the assumption of Gaussian noise, or as the least-absolute (LA) residual $f(x) = \|Ax - b\|_1$ under an impulsive noise assumption such as a Cauchy distribution [1], [11].

On the other hand, the penalty function should intuitively be the $\ell_0$-norm $\|x\|_0$, which counts the number of nonzero elements in $x$. However, minimizing the $\ell_0$-norm is an NP-hard problem. A frequently used alternative is its convex $\ell_1$-norm approximation, i.e., $g(x) = \|x\|_1 = \sum_{i=1}^N |x_i|$ [12]. This convex relaxation model has been widely used in many fields, such as communications [13], synthetic aperture radar (SAR) image processing [6], direction of arrival (DOA) estimation [14] and magnetic resonance imaging (MRI) [7]. It has been proved that an $s$-sparse signal $x$ can be recovered by the $\ell_1$ model under certain assumptions on the operator $A$, such as the restricted isometry property (RIP) when $A$ is a sensing matrix [15]. Although the $\ell_1$-norm induces sparsity most effectively among convex regularizers because of its shrinkage operator, it tends to underestimate the high-amplitude components of $x$, since it penalizes amplitudes uniformly, whereas all nonzero entries receive the same penalty under the original $\ell_0$-norm. This property of the $\ell_1$-norm may lead to reconstruction failures when only the minimal number of measurements is available [16] and sometimes produces undesirable blocky images [8].
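
To make the shrinkage operator and its uniform penalization concrete, here is a minimal NumPy sketch of soft thresholding, the proximal operator behind $\ell_1$ regularization (the function name soft_threshold is our own illustration, not from the paper):

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1 (soft thresholding).

    Every component is shrunk toward zero by lam: entries with
    |x_i| <= lam vanish, and all survivors lose exactly lam in
    magnitude, so high-amplitude components are systematically
    underestimated, as discussed above.
    """
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([-3.0, 0.4, 1.2, -0.1])
print(soft_threshold(x, 0.5))  # [-2.5  0.   0.7 -0. ]
```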

Recently, some works do not approximate the $\ell_0$-norm but deal directly with the $\ell_0$-norm or the $s$-sparse constrained problem, such as the iterative hard thresholding (IHT) algorithm [17] and its variants and accelerated versions: accelerated IHT (AIHT) [18], proximal IHT (PIHT) [19], extrapolated proximal IHT (EPIHT) [20] and accelerated proximal IHT [21]. Another idea is to transform the $\ell_0$ model into an equivalent minimization problem, such as the difference of two convex functions $\|x\|_1 - \|x\|_{(s)}$, where $\|x\|_{(s)}$ denotes the sum of the $s$ largest elements of $x$ in absolute value [22]. Recently, this model has been extended to more general settings, such as the partial regularization $\sum_{i=s+1}^{N} \phi(|x_{(i)}|)$ [23], with $|x_{(i)}|$ denoting the $i$th-largest element among $|x_1|, |x_2|, \ldots, |x_N|$, and the $s$-difference regularization $R(x) - R(x_{[s]})$ [24], with $x_{[s]}$ denoting the best $s$-term approximation to $x$, where $\phi$ and $R$ are regularizers satisfying certain assumptions.
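
As an illustration of these two building blocks, the sketch below computes the best $s$-term approximation $x_{[s]}$ and the largest-$s$ absolute sum $\|x\|_{(s)}$; the function names are ours:

```python
import numpy as np

def best_s_term(x, s):
    """Best s-term approximation x_[s]: keep the s largest-magnitude
    entries of x and zero out the rest (the projection used by
    IHT-type methods and the s-difference regularization)."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]   # indices of the s largest |x_i|
    out[idx] = x[idx]
    return out

def largest_s_sum(x, s):
    """Sum of the s largest entries of |x|, i.e. ||x||_(s), so that
    ||x||_1 - largest_s_sum(x, s) == 0 exactly when x is s-sparse."""
    return np.sort(np.abs(x))[-s:].sum()

x = np.array([0.2, -5.0, 1.0, 3.0])
print(best_s_term(x, 2))                      # [ 0. -5.  0.  3.]
print(np.abs(x).sum() - largest_s_sum(x, 2))  # 1.2 (x is not 2-sparse)
```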

Meanwhile, some researchers have focused on non-convex regularizers that approximate the $\ell_0$-norm better and achieve effective performance, such as the $\ell_p$ (quasi-)norm with $p\in(0,1)$ [25], the capped $\ell_1$-norm [26], the minimax-concave penalty (MCP) [27], the difference of the $\ell_1$ and $\ell_2$ norms ($\ell_{1-2}$) [28], [29], the log-sum penalty (LSP) [30], the smoothly clipped absolute deviation (SCAD) [9] and the correntropy induced metric (CIM) penalty [31], [32]. Among them, the $\ell_{1-2}$ penalty has achieved impressive results: it converges almost surely to a global minimum with the help of a simulated annealing (SA) procedure, and it performs well when the measurement matrix is highly coherent.
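
The behavior that makes $\ell_{1-2}$ attractive is easy to verify numerically: it vanishes on 1-sparse vectors and grows as the energy spreads over the support. A small illustration of ours:

```python
import numpy as np

def l1_minus_l2(x):
    """The L1-2 penalty ||x||_1 - ||x||_2: zero exactly on 1-sparse
    vectors (and the origin), larger as energy spreads over the support."""
    return np.abs(x).sum() - np.linalg.norm(x)

print(l1_minus_l2(np.array([3.0, 0.0, 0.0])))  # 0.0 for a 1-sparse vector
print(l1_minus_l2(np.array([1.0, 1.0, 1.0])))  # 3 - sqrt(3) ~= 1.268
```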

The main challenge faced by minimization problems with such non-convex regularizers $g(x)$ is how to solve them effectively. Many iterative algorithms have been investigated, such as the difference of convex functions algorithm (DCA) [33], [34] and its accelerated versions, Boosted DCA (BDCA) [35] and proximal DCA with extrapolation (pDCAe) [36], the General Iterative Shrinkage and Thresholding (GIST) algorithm [37], the alternating direction method of multipliers (ADMM) [10], split Bregman iteration (SBI) [2], and the nonmonotone accelerated proximal gradient (nmAPG) method [38], an extension of the accelerated proximal gradient (APG) method [39].

Underestimating high-amplitude components is one of the most frequently cited problems of $\ell_1$-norm regularization. To address it, we first propose a non-convex penalty function that retains the advantages of the convex $\ell_1$ norm and the non-convex $\ell_{1-2}$ penalty, and that can be flexibly adjusted between $\ell_1$ and $\ell_{1-2}$. This means it induces sparsity effectively for the low-amplitude components while relieving the underestimation of high-amplitude components. At the same time, thanks to the infimal convolution construction, it can keep the objective function convex under certain conditions.

Second, to solve the resulting non-convex penalized minimization problem, we employ the DCA and forward–backward splitting (FBS) algorithms, respectively. We prove that any cluster point of the sequences generated by these two algorithms is a stationary point. In addition, we derive closed-form solutions for the proximal operator used in FBS, which accelerates the algorithm.

Finally, we discuss some properties of the proposed non-convex penalty, which can be easily extended to other sparse recovery problems. We also evaluate the effectiveness of the proposed algorithms via numerical experiments.

The overall structure of the study takes the form of five sections, including this introductory section. Section 2 lays out the definition of the infimal convolution based penalty (ICP) and examines its theoretical properties. Section 3 is concerned with the iterative algorithms used for solving the ICP-based non-convex problem. Section 4 presents the numerical results. Finally, we provide our conclusions in Section 5.

Here, we define our notation. The $\ell_p$-norm of a vector $x\in\mathbb{R}^N$ is $\|x\|_p = \left(\sum_n |x_n|^p\right)^{1/p}$. In particular, the $\ell_1$, $\ell_2$ and $\ell_\infty$-norms of $x$ are $\|x\|_1 = \sum_n |x_n|$, $\|x\|_2 = \left(\sum_n |x_n|^2\right)^{1/2}$ and $\|x\|_\infty = \max_n |x_n|$, respectively. For any matrix $A\in\mathbb{R}^{M\times N}$, $A^T$ denotes the transpose of $A$, $\|A\|_2^2$ denotes the maximum eigenvalue of $A^T A$, and $A_\Lambda \in \mathbb{R}^{M\times|\Lambda|}$ is the submatrix of $A$ with column indices $\Lambda \subseteq \{1,2,\ldots,N\}$, $|\Lambda|$ being the cardinality of $\Lambda$. $B \preceq A$ means that the matrix $A - B$ is positive semidefinite. $I_N$ denotes the $N\times N$ identity matrix, and $\langle\cdot,\cdot\rangle$ denotes the inner product. The set of proper lower semicontinuous convex functions from $\mathbb{R}^N$ to $\mathbb{R}\cup\{+\infty\}$ is denoted $\Gamma_0(\mathbb{R}^N)$.

Section snippets

Infimal convolution based penalty

We first recall the definition of infimal convolution. For two functions $\phi$ and $\varphi$ from $\mathbb{R}^N$ to $\mathbb{R}\cup\{+\infty\}$, the infimal convolution [40] is given by
$$(\phi \,\square\, \varphi)(x) = \inf_{u\in\mathbb{R}^N} \left\{ \phi(u) + \varphi(x-u) \right\}$$
Instead of the frequently used $g(x) = \|x\|_1$, we propose a new penalty function $g_\beta(x)$ as follows.

Definition 1

Let $x\in\mathbb{R}^N$ and $\beta\in\mathbb{R}_+$. We define the infimal convolution based penalty $g_\beta:\mathbb{R}^N\to\mathbb{R}$ as
$$g_\beta(x) = \|x\|_1 - h_\beta(x)$$
where $h_\beta(x)$ is the infimal convolution of the $\ell_2$-norm with a quadratic, defined as
$$h_\beta(x) = \left(\|\cdot\|_2 \,\square\, \tfrac{\beta}{2}\|\cdot\|_2^2\right)(x) = \inf_{u\in\mathbb{R}^N} \left\{ \|u\|_2 + \tfrac{\beta}{2}\|u - x\|_2^2 \right\}$$

Property 1

The defined function $h_\beta:\mathbb{R}^N\to\mathbb{R}$ satisfies the following properties.

(a) $h_\beta(x)$ …
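
Although the property list is truncated in this snippet, note that $h_\beta$ is the Moreau envelope of the $\ell_2$-norm, so it can be evaluated in closed form via block soft thresholding. The sketch below is our own reconstruction under that standard identity, not code from the paper; the names h_beta and g_beta are ours:

```python
import numpy as np

def h_beta(x, beta):
    """Closed form of h_beta(x) = inf_u { ||u||_2 + (beta/2)||u - x||_2^2 }.

    The minimizer is the block soft thresholding u* = max(1 - 1/(beta*||x||_2), 0) * x,
    which yields the piecewise expression below (standard Moreau-envelope
    computation; our reconstruction, not the paper's code).
    """
    nx = np.linalg.norm(x)
    if nx <= 1.0 / beta:
        return 0.5 * beta * nx**2   # u* = 0 regime: quadratic near the origin
    return nx - 0.5 / beta          # u* != 0 regime: ||x||_2 minus a constant

def g_beta(x, beta):
    """The ICP penalty g_beta(x) = ||x||_1 - h_beta(x)."""
    return np.abs(x).sum() - h_beta(x, beta)

x = np.array([2.0, 0.0, -1.0])
for beta in (1e-3, 1.0, 1e3):
    print(beta, g_beta(x, beta))
# beta -> 0 recovers ||x||_1; beta -> inf approaches ||x||_1 - ||x||_2,
# matching the claimed interpolation between the two penalties.
```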

Two iterative algorithms for ICP minimization problem

In this section, we employ two iterative frameworks for solving the unconstrained ICP-based non-convex problem, under the assumption that the observation noise is Gaussian. We consider the following minimization problem
$$\min_{x\in\mathbb{R}^N} F(x) \triangleq \frac{1}{2}\|Ax - b\|_2^2 + \lambda g_\beta(x)$$
where $A\in\mathbb{R}^{M\times N}$ is the measurement matrix and $b\in\mathbb{R}^M$ is the measurement data.
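
Since $h_\beta$ is a Moreau envelope and hence differentiable with a $\beta$-Lipschitz gradient, one plausible forward–backward splitting for this problem treats $\frac{1}{2}\|Ax-b\|_2^2 - \lambda h_\beta(x)$ as the smooth part and handles $\lambda\|x\|_1$ by soft thresholding. This is a sketch under our own splitting choice, not necessarily the paper's (the paper's FBS uses the derived proximal operator of $g_\beta$ itself), and the name fbs_icp is ours:

```python
import numpy as np

def grad_h_beta(x, beta):
    """Gradient of the Moreau envelope h_beta: beta * (x - u*), with u*
    the block soft thresholding of x (our reconstruction, see above)."""
    nx = np.linalg.norm(x)
    if nx <= 1.0 / beta:
        return beta * x
    return x / nx

def fbs_icp(A, b, lam, beta, n_iter=500):
    """Forward-backward splitting for 0.5*||Ax-b||^2 + lam*(||x||_1 - h_beta(x))."""
    x = np.zeros(A.shape[1])
    L = np.linalg.norm(A, 2) ** 2 + lam * beta  # Lipschitz bound of the smooth gradient
    t = 1.0 / L
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b) - lam * grad_h_beta(x, beta)  # forward (gradient) step
        z = x - t * grad
        x = np.sign(z) * np.maximum(np.abs(z) - t * lam, 0.0)  # backward (prox) step
    return x

# Toy usage on a synthetic 3-sparse signal (dimensions are illustrative):
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 256))
x0 = np.zeros(256); x0[[3, 50, 100]] = [1.0, -2.0, 1.5]
b = A @ x0 + 0.01 * rng.standard_normal(64)
x_hat = fbs_icp(A, b, lam=0.1, beta=1.0)
```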

Numerical experiments

In this section, we present numerical experiments to demonstrate the efficiency of the ICP. We compare the proposed algorithm with six methods: (1) the $\ell_1$-norm regularization based ADMM-lasso [5]; (2) the semismooth Newton augmented Lagrangian (SSNAL) method [45] for the LASSO problem (http://www.math.nus.edu.sg/~mattohkc/SuiteLasso.html); (3) the iterative $p$-shrinkage (IPS) algorithm [3] with $p=1$, which uses the $p$-shrinkage mapping $S_{p,\lambda}(x_i) = \mathrm{sign}(x_i)\max\left\{|x_i| - \lambda^{2-p}|x_i|^{p-1}, 0\right\}$; (4) the $\ell_p$ …
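
For reference, the quoted $p$-shrinkage mapping is straightforward to implement; the sketch below (the name p_shrinkage is ours) reduces to ordinary soft thresholding at $p=1$ and shrinks large entries less as $p$ decreases:

```python
import numpy as np

def p_shrinkage(x, lam, p):
    """The p-shrinkage mapping quoted above:
    S_{p,lam}(x_i) = sign(x_i) * max(|x_i| - lam**(2-p) * |x_i|**(p-1), 0).
    For p = 1 this is soft thresholding; for p < 1 the shrinkage amount
    decays with |x_i|, penalizing high-amplitude components less.
    """
    ax = np.abs(x)
    with np.errstate(divide="ignore"):  # |x_i| = 0 entries map to zero anyway
        shrink = np.where(ax > 0, lam ** (2 - p) * ax ** (p - 1), np.inf)
    return np.sign(x) * np.maximum(ax - shrink, 0.0)

x = np.array([-3.0, 0.2, 1.5])
print(p_shrinkage(x, 0.5, 1.0))  # soft thresholding: [-2.5  0.   1. ]
print(p_shrinkage(x, 0.5, 0.5))  # lighter shrinkage on the large entries
```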

Conclusions

In this paper, we have proposed an infimal convolution based penalty function for sparse signal recovery and employed two iterative methods to solve the resulting non-convex optimization problem: one uses the DCA framework with ADMM solving the subproblem, and the other employs FBS with a closed-form proximal operator. The convergence of the proposed algorithms is proved. The experimental results demonstrate the effectiveness of the proposed method by comparing with some other …

CRediT authorship contribution statement

Lin Lei: Developed the idea, Writing - original draft. Yuli Sun: Developed the idea, Formal analysis, Writing - original draft. Xiao Li: Formal analysis.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors would like to thank the editors and anonymous reviewers for their careful reading of an earlier version of this article and constructive suggestions that improved the presentation of this work.

References (48)

  • Woodworth, J., et al., Compressed sensing recovery via nonconvex shrinkage penalties, Inverse Problems (2016)

  • Zhang, J., Image compressive sensing recovery via collaborative sparsity, IEEE J. Emerg. Sel. Topics Circuits Syst. (2012)

  • Boyd, S., et al., Distributed optimization and statistical learning via the alternating direction method of multipliers, Found. Trends Mach. Learn. (2011)

  • Dong, G., et al., Classification via sparse representation of steerable wavelet frames on Grassmann manifold: Application to target recognition in SAR image, IEEE Trans. Image Process. (2017)

  • Lustig, M., et al., Compressed sensing MRI, IEEE Signal Process. Mag. (2008)

  • Wen, F., et al., Efficient and robust recovery of sparse signal and image using generalized nonconvex regularization, IEEE Trans. Comput. Imaging (2017)

  • Candès, E.J., et al., Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inform. Theory (2006)

  • Berger, C.R., et al., Application of compressive sensing to sparse channel estimation, IEEE Commun. Mag. (2010)

  • Chartrand, R., et al., Restricted isometry properties and nonconvex compressive sensing, Inverse Problems (2008)

  • Lu, Z., Iterative hard thresholding methods for $\ell_0$ regularized convex cone programming, Math. Program. (2014)

  • Bao, C., Image restoration by minimizing zero norm of wavelet frame coefficients, Inverse Problems (2016)

  • Zhang, X., et al., An accelerated proximal iterative hard thresholding method for $\ell_0$ minimization (2017)

  • Gotoh, J., et al., DC formulations and algorithms for sparse optimization problems, Math. Program. (2018)

  • Lu, Z., et al., Sparse recovery via partial regularization: Models, theory, and algorithms, Math. Oper. Res. (2018)
This work was partially supported by the National Natural Science Foundation of China (61701508).

¹ These authors contributed equally.
