
Sparse signal recovery via generalized Gaussian function


Abstract

In this paper, we replace the \(\ell _0\) norm with the variation of the generalized Gaussian function \(\Phi _\alpha (x)\) in sparse signal recovery. We first show that \(\Phi _\alpha (x)\) is a non-convex sparsity-promoting function and establish the equivalence among the three minimization models \((\mathrm{P}_0):\min \limits _{x\in {\mathbb {R}}^n}\Vert x\Vert _0\) subject to \( Ax=b\), \({\mathrm{(E}_\alpha )}:\min \limits _{x\in {\mathbb {R}}^n}\Phi _\alpha (x)\) subject to \(Ax=b\), and \((\mathrm{E}^{\lambda }_\alpha ):\min \limits _{x\in {\mathbb {R}}^n}\frac{1}{2}\Vert Ax-b\Vert ^{2}_{2}+\lambda \Phi _\alpha (x).\) The equivalence theorems show that \((\mathrm{P}_0)\) can be solved completely via the continuous minimization \((\mathrm{E}_\alpha )\) for suitable values of \(\alpha \), while the latter is computable by solving the regularized minimization \((\mathrm{E}^{\lambda }_\alpha )\) under certain conditions. Second, based on the DC algorithm and the iterative soft thresholding algorithm, we propose an algorithm for the regularized minimization \((\mathrm{E}^{\lambda }_\alpha )\), called the DCS algorithm. Finally, extensive simulations compare this algorithm with two classical algorithms, the half thresholding algorithm and the soft thresholding algorithm, and the experimental results show that the DCS algorithm performs well in sparse signal recovery.


References

  1. Aharon, M., Elad, M., Bruckstein, A.: K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 54(11), 4311–4322 (2006)

  2. An, L.T.H., Tao, P.D.: The DC programming and DCA revisited with DC models of real world nonconvex optimization problems. Ann. Oper. Res. 133(1), 23–46 (2005)

  3. Babacan, S.D., Molina, R., Katsaggelos, A.K.: Bayesian compressive sensing using Laplace priors. IEEE Trans. Image Process. 19(1), 53–63 (2010)

  4. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183–202 (2009)

  5. Blumensath, T., Davies, M.E.: Iterative thresholding for sparse approximations. J. Fourier Anal. Appl. 14(5), 629–654 (2008)

  6. Candès, E., Romberg, J., Tao, T.: Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math. 59(8), 1207–1223 (2006)

  7. Candès, E., Rudelson, M., Tao, T., Vershynin, R.: Error correction via linear programming. In: 46th Annual IEEE Symposium on Foundations of Computer Science, pp. 668–681. Pittsburgh, PA (2005)

  8. Candès, E., Tao, T.: Decoding by linear programming. IEEE Trans. Inform. Theory 51(12), 4203–4215 (2005)

  9. Cao, W., Sun, J., Xu, Z.: Fast image deconvolution using closed-form thresholding formulas of \(L_q\) regularization. J. Vis. Commun. Image Represent. 24(1), 31–41 (2013)

  10. Carmi, A., Gurfil, P., Kanevsky, D., Ramabhadran, B.: ABCS: approximate Bayesian compressed sensing. IBM Research Report, RC24816 (2009)

  11. Chartrand, R.: Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process. Lett. 14(10), 707–710 (2007)

  12. Davies, M.E., Gribonval, R.: Restricted isometry constants where \(l_p\) sparse recovery can fail for \(0<p<1\). IEEE Trans. Inform. Theory 55(5), 2203–2214 (2009)

  13. Donoho, D.L.: Compressed sensing. IEEE Trans. Inform. Theory 52(4), 1289–1306 (2006)

  14. Donoho, D.L.: De-noising by soft-thresholding. IEEE Trans. Inform. Theory 41(3), 613–627 (1995)

  15. Donoho, D.L., Elad, M.: Optimally sparse representation in general (nonorthogonal) dictionaries via \(l_1\) minimization. Proc. Natl. Acad. Sci. USA 100(5), 2197–2202 (2003)

  16. Donoho, D.L., Tanner, J.: Sparse nonnegative solution of underdetermined linear equations by linear programming. Proc. Natl. Acad. Sci. USA 102(27), 9446–9451 (2005)

  17. Elad, M.: Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer, New York (2010)

  18. Elhamifar, E., Vidal, R.: Sparse subspace clustering: algorithm, theory, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 35(11), 2765–2781 (2013)

  19. Glasbey, C.A.: An analysis of histogram-based thresholding algorithms. CVGIP Graph. Models Image Process. 55(6), 532–537 (1993)

  20. Goldstein, T., Osher, S.: The split Bregman method for \(\ell _1\)-regularized problems. SIAM J. Imag. Sci. 2(2), 323–343 (2009)

  21. Horst, R., Thoai, N.V.: DC programming: overview. J. Optim. Theory Appl. 103(1), 1–43 (1999)

  22. Ji, S., Xue, Y., Carin, L.: Bayesian compressive sensing. IEEE Trans. Signal Process. 56(6), 2346–2356 (2008)

  23. Le Thi, H.A., Thi, B.T.N., Le, H.M.: Sparse signal recovery by difference of convex functions algorithms. In: Intelligent Information and Database Systems, vol. 7803, pp. 387–397. Berlin, Germany (2013)

  24. Li, H., Zhang, Q., Cui, A., Peng, J.: Minimization of fraction function penalty in compressed sensing. IEEE Trans. Neural Netw. Learn. Syst. 31(5), 1626–1637 (2020)

  25. Liu, G., Lin, Z., Yan, S.: Robust recovery of subspace structures by low-rank representation. IEEE Trans. Pattern Anal. Mach. Intell. 35(1), 171–184 (2013)

  26. Louizos, C., Welling, M., Kingma, D.: Learning sparse neural networks through \(\ell _0\) regularization (2017). arXiv preprint arXiv:1712.01312

  27. Moulin, P., Liu, J.: Analysis of multiresolution image denoising schemes using generalized Gaussian and complexity priors. IEEE Trans. Inform. Theory 45(3), 909–919 (1999)

  28. Natarajan, B.: Sparse approximate solutions to linear systems. SIAM J. Comput. 24(2), 227–234 (1995)

  29. Nikolova, M.: Estimées locales fortement homogènes. Comptes Rendus Acad. Sci. Paris 325(1), 665–670 (1997)

  30. Ong, C.S., Hoai An, L.T.: Learning sparse classifiers with difference of convex functions algorithms. Optim. Methods Softw. 28(4), 830–854 (2013)

  31. Scardapane, S., Comminiello, D., Hussain, A., Uncini, A.: Group sparse regularization for deep neural networks. Neurocomputing 241, 81–89 (2017)

  32. Tao, P.D., Hoai An, L.T.: Convex analysis approach to d.c. programming: theory, algorithms and applications. Acta Mathematica Vietnamica 22(1), 289–355 (1997)

  33. Temlyakov, V.N.: Greedy algorithm and \(m\)-term trigonometric approximation. Construct. Approx. 14(4), 569–587 (1998)

  34. Thiao, M.: Approches de la programmation DC et DCA en data mining. Doctoral thesis, INSA-Rouen, France (2011)

  35. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. Roy. Stat. Soc. Ser. B 58(1), 267–288 (1996)

  36. Tropp, J., Gilbert, A.: Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inform. Theory 53(12), 4655–4666 (2007)

  37. Wen, B., Chen, X., Pong, T.K.: A proximal difference-of-convex algorithm with extrapolation. Comput. Optim. Appl. 69, 1–28 (2018)

  38. Xu, Z., Chang, X., Xu, F., Zhang, H.: \(L_{1/2}\) regularization: a thresholding representation theory and a fast solver. IEEE Trans. Neural Netw. Learn. Syst. 23(7), 1013–1027 (2012)

  39. Yang, J., Zhang, Y.: Alternating direction algorithms for \(\ell _1\) problems in compressive sensing. SIAM J. Sci. Comput. 33(1), 250–278 (2011)

  40. Yin, W., Osher, S., Goldfarb, D., Darbon, J.: Bregman iterative algorithms for \(\ell _1\)-minimization with applications to compressed sensing. SIAM J. Imag. Sci. 1(1), 143–168 (2008)


Acknowledgements

We are grateful to the editor and the referees for their comments, which led to substantial improvements in the manuscript.

Author information

Corresponding authors

Correspondence to Haiyang Li or Jigen Peng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Proofs of Theorems and Lemmas

In this section, we prove the theorems and lemmas stated in the paper.

It is easy to verify that for any \(\alpha >0, \nu \in (0,1]\), \(\phi _\alpha (t)\) is symmetric on \({\mathbb {R}}\), increasing and concave for \(t\in [0,+\infty )\), \(\phi _\alpha (t)=0\) if \(t=0\) and \(\lim \limits _{\alpha \rightarrow +\infty }\phi _\alpha (t)=1\) if \(t\ne 0\). Therefore, by adjusting the parameter \(\alpha \), \(\Phi _\alpha (x)\) approximates the \(\ell _0\) norm well. Moreover, the function \(\phi _\alpha (t)\) satisfies the following triangle inequality.
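As a quick numerical illustration of this approximation property, the short Python snippet below (ours, not from the paper) evaluates \(\Phi _\alpha (x)=\sum _{i=1}^n\phi _\alpha (x_i)\) with \(\phi _\alpha (t)=1-e^{-\alpha \mid t\mid ^\nu }\) and compares it with \(\Vert x\Vert _0\) as \(\alpha \) grows.

import numpy as np

def phi(t, alpha, nu):
    # phi_alpha(t) = 1 - exp(-alpha * |t|^nu), the generalized Gaussian penalty
    return 1.0 - np.exp(-alpha * np.abs(t) ** nu)

def Phi(x, alpha, nu):
    # Phi_alpha(x) = sum_i phi_alpha(x_i); tends to ||x||_0 as alpha -> +infinity
    return float(np.sum(phi(x, alpha, nu)))

x = np.array([3.0, -0.5, 0.0, 1e-3, 0.0])
print("||x||_0 =", np.count_nonzero(x))
for alpha in (1.0, 10.0, 100.0, 1000.0):
    print("alpha =", alpha, " Phi_alpha(x) =", Phi(x, alpha, nu=0.5))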

Lemma 2

For any \(\alpha >0\), \(\nu \in (0,1]\) and any real number \(x,\ y\), the triangle inequality holds:

$$\begin{aligned} \phi _{\alpha }(x+y)\le \phi _{\alpha }(x)+\phi _{\alpha }(y). \end{aligned}$$

Proof

Because

$$\begin{aligned} \phi _{\alpha }(x+y)-\phi _{\alpha }(x)-\phi _{\alpha }(y)&=e^{-\alpha \mid x\mid ^\nu }+e^{-\alpha \mid y\mid ^\nu }-e^{-\alpha \mid x+y\mid ^\nu }-1\\&\quad \le e^{-\alpha \mid x\mid ^\nu }+e^{-\alpha \mid y\mid ^\nu }-e^{-\alpha (\mid x\mid +\mid y\mid )^\nu }-1\\&\quad \le e^{-\alpha \mid x\mid ^\nu }+e^{-\alpha \mid y\mid ^\nu }-e^{-\alpha (\mid x\mid ^\nu +\mid y\mid ^\nu )}-1\\&=e^{-\alpha \mid x\mid ^\nu }(1-e^{-\alpha \mid y\mid ^\nu })+e^{-\alpha \mid y\mid ^\nu }-1\\&\quad \le 1-e^{-\alpha \mid y\mid ^\nu }+e^{-\alpha \mid y\mid ^\nu }-1\\&=0, \end{aligned}$$

the triangle inequality holds. \(\square \)
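A quick randomized sanity check of Lemma 2 in Python (ours; it merely evaluates both sides of the inequality on random inputs):

import numpy as np

rng = np.random.default_rng(0)
alpha, nu = 2.0, 0.7
x = rng.normal(size=100000)
y = rng.normal(size=100000)
lhs = 1.0 - np.exp(-alpha * np.abs(x + y) ** nu)           # phi_alpha(x + y)
rhs = (1.0 - np.exp(-alpha * np.abs(x) ** nu)) \
    + (1.0 - np.exp(-alpha * np.abs(y) ** nu))             # phi_alpha(x) + phi_alpha(y)
print(bool(np.all(lhs <= rhs + 1e-12)))                    # expected: True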

Proof of Theorem 1

Suppose, to the contrary, that the optimal solution \(\hat{x}\) to \((\mathrm{E}_{\alpha })\) satisfies \(\Vert \hat{x}\Vert _{0}=k> m\). Then the \(k\) columns of \(A\) indexed by the support of \(\hat{x}\) are linearly dependent. Thus, there exists a non-trivial vector \(h\in {\mathbb {R}}^n\) with the same support as \(\hat{x}\) such that \(Ah=0\). Obviously, \(\hat{x}-h\) and \(\hat{x}+h\) are also solutions to \(Ax=b\).

By rescaling \(h\) if necessary, we may assume that

$$\begin{aligned} \max _{1\le j\le k}\mid h_{j}\mid <\min _{1\le j\le k}\mid \hat{x}_j\mid . \end{aligned}$$

Under this assumption, for every \(j\in supp(\hat{x})\), \(\hat{x}_{j}-h_{j}\), \(\hat{x}_{j}+h_{j}\) and \(\hat{x}_{j}\) have the same sign, and hence \(\mid \hat{x}_j-h_j\mid +\mid \hat{x}_j+h_j\mid =2\mid \hat{x}_j\mid \). Owing to the fact that \(\phi _{\alpha }(t)=1-e^{-\alpha \mid t\mid ^\nu }\) is a strictly concave function of \(\mid t\mid \), there is

$$\begin{aligned} \left( 1-e^{-\alpha (\mid \hat{x}_j-h_j\mid ^\nu )}\right) +\left( 1-e^{-\alpha (\mid \hat{x}_j+h_j\mid ^\nu )}\right) <2\left( 1-e^{-\alpha \mid \hat{x}_j\mid ^\nu }\right) , \end{aligned}$$

thus

$$\begin{aligned} \sum _{j=1}^n \left( 1-e^{-\alpha (\mid \hat{x}_j-h_j\mid ^\nu )}\right) +\sum _{j=1}^n \left( 1-e^{-\alpha (\mid \hat{x}_j+h_j\mid ^\nu )}\right) <2\sum _{j=1}^n \ \left( 1-e^{-\alpha \mid \hat{x}_j\mid ^\nu }\right) . \end{aligned}$$

Furthermore, since the two sums on the left cannot both be greater than or equal to \(\sum _{j=1}^n(1-e^{-\alpha \mid \hat{x}_j\mid ^\nu })\), we get

$$\begin{aligned} \sum _{j=1}^n \left( 1-e^{-\alpha (\mid \hat{x}_j-h_j\mid ^\nu )}\right) <\sum _{j=1}^n \left( 1-e^{-\alpha \mid \hat{x}_j\mid ^\nu }\right) \end{aligned}$$

or

$$\begin{aligned} \sum _{j=1}^n \ \left( 1-e^{-\alpha (\mid \hat{x}_j+h_j\mid ^\nu )}\right) <\sum _{j=1}^n \ \left( 1-e^{-\alpha \mid \hat{x}_j\mid ^\nu }\right) , \end{aligned}$$

either of which contradicts the optimality of \(\hat{x}\) for \((\mathrm{E}_{\alpha })\). Hence \(\Vert \hat{x}\Vert _{0}\le m\), and the theorem is proved. \(\square \)

Proof of Theorem 2

Let \(x^{*}\) be the optimal solution to \((\mathrm{E}^\lambda _\alpha )\) and \(\Vert x^*\Vert _0=k\). Without loss of generality, we assume

$$\begin{aligned} x^*=(x^*_1,\ldots ,x^*_k,0,\ldots ,0)^T. \end{aligned}$$

It is obvious that for all \(t\in {\mathbb {R}}\) and \(h\in {\mathbb {R}}^{n}\),

$$\begin{aligned} \frac{1}{2}\Vert Ax^{*}-b\Vert ^{2}_{2}+\lambda E_{\alpha }(x^{*})\le \frac{1}{2}\Vert A(x^{*}+th)-b\Vert ^{2}_{2}+\lambda E_{\alpha }(x^{*}+th), \end{aligned}$$

which implies that

$$\begin{aligned} \frac{1}{2} t^{2}\Vert Ah\Vert ^{2}_{2}+t\langle Ax^{*}-b,Ah\rangle +\lambda (E_{\alpha }(x^{*}+th)-E_{\alpha }(x^{*}))\ge 0. \end{aligned}$$
(A1)

If \(supp(h)\subseteq supp(x^{*})\), then for all \(t\in {\mathbb {R}}\),

$$\begin{aligned} E_{\alpha }(x^{*}+th)-E_{\alpha }(x^{*})=\sum _{i\in supp(x^{*})}(1-e^{-\alpha \mid x^{*}_{i}+th_{i}\mid ^\nu }-1+e^{-\alpha \mid x^{*}_{i}\mid ^\nu }). \end{aligned}$$
(A2)

Incorporating the equality (A2) into the inequality (A1), we have

$$\begin{aligned} \frac{1}{2} t^{2}\Vert Ah\Vert ^{2}_{2}+t\langle Ax^{*}-b,Ah\rangle +\lambda \sum _{i\in supp(x^{*})}(e^{-\alpha \mid x^{*}_{i}\mid ^\nu }-e^{-\alpha \mid x^{*}_{i}+th_{i}\mid ^\nu })\ge 0. \end{aligned}$$

Dividing both sides of the inequality above by \(t>0\) and letting \(t\rightarrow 0^{+}\) yields

$$\begin{aligned} \langle b-Ax^{*},Ah\rangle -\lambda \alpha \nu \sum _{i\in supp(x^{*})} h_{i} \mathrm{sgn}(x^{*}_{i})\mid x^{*}_{i}\mid ^{\nu -1}e^{-\alpha \mid x^{*}_{i}\mid ^\nu }\le 0. \end{aligned}$$

Apparently, the above inequality also holds for \(-h\) which leads to

$$\begin{aligned} \langle b-Ax^{*},Ah\rangle =\lambda \alpha \nu \sum _{i\in supp(x^{*})}h_{i}\mathrm{sgn}(x^{*}_{i})\mid x^{*}_{i}\mid ^{\nu -1} e^{-\alpha \mid x^{*}_{i}\mid ^\nu }. \end{aligned}$$
(A3)

Then, incorporating the equality (A3) into the inequality (A1) yields that, for all \(t\in {\mathbb {R}}\),

$$\begin{aligned} \frac{t^2}{2}\Vert Ah\Vert _2^2&\ge t\lambda \alpha \nu \sum _{i\in supp(x^{*})}h_{i}\mathrm{sgn}(x^{*}_{i})\mid x^{*}_{i}\mid ^{\nu -1} e^{-\alpha \mid x^{*}_{i}\mid ^\nu }\nonumber \\&-\lambda (E_\alpha (x^*+th)-E_\alpha (x^*)). \end{aligned}$$
(A4)

Dividing by \(t^2\) on both sides of the inequality (A4) and letting \(t\rightarrow 0\) yield

$$\begin{aligned} \Vert Ah\Vert _2^2\ge \lambda \alpha \nu \sum _{i\in supp(x^*)}h^2_i\mid x^*_i\mid ^{\nu -2}e^{-\alpha \mid x^*_i\mid ^\nu }(1-\nu +\alpha \nu \mid x^*_i\mid ^\nu ). \end{aligned}$$
(A5)

Denote the first k columns of A by B. According to the inequality above, for any \(y\ne 0\) with \(y\in {\mathbb {R}}^k\),

$$\begin{aligned} \Vert By\Vert _2^2=\Vert A\hat{y}\Vert _2^2\ge \lambda \alpha \nu \sum _{i=1}^ky^2_i\mid x^*_i\mid ^{\nu -2}e^{-\alpha \mid x^*_i\mid ^\nu }(1-\nu +\alpha \nu \mid x^*_i\mid ^\nu )>0, \end{aligned}$$

where \(\hat{y}=(y_1,\ldots ,y_k,0,\ldots ,0)^T\), which implies that the matrix \(B^TB\) is positive definite, and hence the \(k\) columns of B are linearly independent, so that \(k\le m\), that is, \(\Vert x^*\Vert _0\le m\). \(\square \)

Before proving the equivalence between \((\mathrm P_0)\) and the constrained minimization problems \((\mathrm{E}_\alpha )\) (Theorems 3 and 4), we need to introduce two constants and Lemma 3.

We denote by \(\Delta \) the set of solutions to \(Ax=b\) with \(\Vert x\Vert _0\le m\); the cardinality of \(\Delta \) is then finite. In fact, for any \(x\in \Delta \), because \(A_{m\times n}\) has full row rank, there exists a full-rank square submatrix \(B_{m\times m}\) of \(A_{m\times n}\) such that \(Bx_{B}=b\), where \(x_{B}\) denotes the subvector of \(x\) indexed by the columns of \(B\). Since \(A_{m\times n}\) has at most \(\binom{n}{m}\) full-rank square submatrices \(B_{m\times m}\), the set \(\Delta \) has at most \(\binom{n}{m}\) elements. Furthermore, we define two constants \(q(A,b)\) and \(Q(A,b)\) as follows

$$\begin{aligned} q(A,b)=\min \limits _{z\in \Delta ,z_i\not =0, 1\le i\le n}\mid z_i\mid ,\ \ \ Q(A,b)=\max \limits _{z\in \Delta ,z_i\not =0, 1\le i\le n}\mid z_i\mid . \end{aligned}$$
(A6)

Because \(\Delta \) has at most \(\binom{n}{m}\) elements, the set \(\{z_i\mid z\in \Delta ,z_i\not =0, 1\le i\le n\}\) has at most \(m\times \binom{n}{m}\) non-zero elements, which implies that the constants \(q(A,b)\) and \(Q(A,b)\) are finite and positive.
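On small instances these constants can be computed exactly by the enumeration just described. The Python sketch below (ours, brute force over all \(\binom{n}{m}\) column subsets, so for illustration only) builds the candidate solutions and returns \(q(A,b)\) and \(Q(A,b)\) as in (A6).

import numpy as np
from itertools import combinations

def q_and_Q(A, b, tol=1e-12):
    # Enumerate candidate solutions x with Ax = b and ||x||_0 <= m by solving
    # B x_B = b for every invertible m x m column submatrix B of A, then take
    # the min / max absolute value over all non-zero entries, as in (A6).
    m, n = A.shape
    nonzero_values = []
    for cols in combinations(range(n), m):
        B = A[:, cols]
        if abs(np.linalg.det(B)) < tol:
            continue  # skip singular submatrices
        x_B = np.linalg.solve(B, b)
        nonzero_values.extend(abs(v) for v in x_B if abs(v) > tol)
    return min(nonzero_values), max(nonzero_values)

A = np.array([[1.0, 0.0, 1.0, 2.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0])
print(q_and_Q(A, b))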

Lemma 3

If there is a constant \(0<p<1\) such that, for any \(x\in (0,p)\),

$$\begin{aligned} \sum _{i=1}^{n}x^{t_i}=\sum _{i=1}^{n}x^{s_i}, \end{aligned}$$

where \(t_1\ge t_2\ge \cdots \ge t_n>0, s_1\ge s_2\ge \cdots \ge s_n>0\), then \(t_1=s_1, t_2=s_2, \ldots , t_n=s_n\).

Proof

We argue by induction on the number of terms, writing \(N\) for the current number of terms (initially \(N=n\)). Since

$$\begin{aligned} \sum _{i=1}^{N}x^{t_i}=\sum _{i=1}^{N}x^{s_i} \end{aligned}$$
(A7)

for any \(x\in (0,p)\), dividing both sides of the equality (A7) by \(x^{t_N}\) gives

$$\begin{aligned} \sum _{i=1}^{N-1}x^{t_i-t_N}+1=\sum _{i=1}^{N}x^{s_i-t_N}. \end{aligned}$$
(A8)

Letting \(x\rightarrow 0^{+}\), the limit of the left-hand side of (A8) exists and is finite because \(t_i-t_N\ge 0\ (i=1,2,\ldots ,N-1)\), which implies that the limit of the right-hand side of (A8) is also finite. Hence, \(s_N\ge t_N\). Similarly, dividing both sides of (A7) by \(x^{s_N}\), we obtain \(s_N\le t_N\). Therefore \(s_N=t_N\) and

$$\begin{aligned} \sum _{i=1}^{N-1}x^{t_i}=\sum _{i=1}^{N-1}x^{s_i}. \end{aligned}$$

Repeating the process above, we have \(t_1=s_1, t_2=s_2, \ldots , t_N=s_N\), as claimed. \(\square \)

Proof of Theorem 3

Let \(x^{\alpha }\) be the optimal solution to \((\mathrm{E}_{\alpha })\) and \(x^{0}\) the optimal solution to \((\mathrm P_0)\). It suffices to show that \(\Vert x^{\alpha }\Vert _{0}=\Vert x^0\Vert _{0}\) and \(\Phi _{\alpha }(x^0)=\Phi _{\alpha }(x^\alpha )\). Firstly, we have

$$\begin{aligned} {\min _{\{x\mid Ax=b\}}} \Vert x\Vert _{0}&=\Vert x^{0}\Vert _{0}\\&\quad \ge \sum _{i\in supp(x^{0})} (1-e^{-\alpha \mid x^{0}_i\mid ^\nu })\nonumber \\&\quad \ge \sum _{i\in supp(x^{\alpha })} (1-e^{-\alpha \mid x^{\alpha }_i\mid ^\nu })\nonumber \\&\quad \ge \Vert x^{\alpha }\Vert _{0} (1-e^{-\alpha q(A,b)^\nu })\nonumber \end{aligned}$$
(A9)

Since \(x^{0}\) and \(x^{\alpha }\) are the optimal solutions to \((\mathrm P_0)\) and \((\mathrm{E}_{\alpha })\), respectively, the first equality and the second inequality hold. The first inequality is true by the definition of \(\Phi _\alpha (x)\), since each summand is at most 1, and the last inequality holds by the definition of \(q(A,b)\) in (A6).

Moreover, for any \(\alpha \) with

$$\begin{aligned} \alpha >\frac{1}{q(A,b)^\nu }\ln (\Vert x^{0}\Vert _{0}+1), \end{aligned}$$
(A10)

the following inequality is true

$$\begin{aligned} 1-e^{-\alpha q(A,b)^\nu }>\frac{\Vert x^{0}\Vert _{0}}{\Vert x^{0}\Vert _{0}+1}. \end{aligned}$$
(A11)

Combining inequality (A9) and (A11) yields that

$$\begin{aligned} \Vert x^{0}\Vert _{0}>\frac{\Vert x^{0}\Vert _0}{\Vert x^{0}\Vert _{0}+1}\Vert x^{\alpha }\Vert _{0}, \end{aligned}$$

which, together with \(\Vert x^{0}\Vert _{0}\le \Vert x^{\alpha }\Vert _{0}\) (as \(x^{\alpha }\) is feasible for \((\mathrm P_0)\)), implies that

$$\begin{aligned} \Vert x^{0}\Vert _{0}\le \Vert x^{\alpha }\Vert _{0}<\Vert x^{0}\Vert _{0}+1. \end{aligned}$$

Since \(\Vert x^{\alpha }\Vert _{0}\) is an integer, we get \(\Vert x^{\alpha }\Vert _{0}=\Vert x^0\Vert _{0}\) whenever (A10) holds.

Secondly, if \(\Phi _{\alpha }(x^0)>\Phi _{\alpha }(x^\alpha )\), then

$$\begin{aligned} 0&<\Phi _{\alpha }(x^0)-\Phi _{\alpha }(x^\alpha )\\&=\sum _{i\in supp(x^{0})}(1-e^{-\alpha \mid x^{0}_i\mid ^\nu })-\sum _{i\in supp(x^{\alpha })}(1-e^{-\alpha \mid x^{\alpha }_i\mid ^\nu })\\&\quad \le \Vert x^{\alpha }\Vert _0(1-e^{-\alpha Q(A,b)^\nu })-\Vert x^{\alpha }\Vert _0(1-e^{-\alpha q(A,b)^\nu })\\&=\Vert x^{\alpha }\Vert _0(e^{-\alpha q(A,b)^\nu }-e^{-\alpha Q(A,b)^\nu }), \end{aligned}$$

where the inequality is true by the definition of \(Q(A,b)\) and \(q(A,b)\) in (A6) together with \(\Vert x^{0}\Vert _0\le \Vert x^{\alpha }\Vert _0\). Dividing both sides by \(\Vert x^{\alpha }\Vert _0\), we have

$$\begin{aligned} 0<\frac{\Phi _{\alpha }(x^0)-\Phi _{\alpha }(x^\alpha )}{\Vert x^{\alpha }\Vert _0}\le e^{-\alpha q(A,b)^\nu }-e^{-\alpha Q(A,b)^\nu }. \end{aligned}$$
(A12)

Denote by \(u(\alpha )\) the right-hand side of (A12). It is easy to verify that \(u(\alpha )\rightarrow 0\) as \(\alpha \rightarrow +\infty \) and that \(u(\alpha )\) is monotonically decreasing when

$$\begin{aligned} \alpha \ge \frac{\nu (\ln Q(A,b)-\ln q(A,b))}{Q(A,b)^\nu -q(A,b)^\nu }. \end{aligned}$$
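Indeed, differentiating \(u(\alpha )=e^{-\alpha q(A,b)^\nu }-e^{-\alpha Q(A,b)^\nu }\) gives (assuming \(Q(A,b)>q(A,b)\))

$$\begin{aligned} u'(\alpha )=Q(A,b)^\nu e^{-\alpha Q(A,b)^\nu }-q(A,b)^\nu e^{-\alpha q(A,b)^\nu }\le 0 \ \Longleftrightarrow \ \alpha \ge \frac{\nu (\ln Q(A,b)-\ln q(A,b))}{Q(A,b)^\nu -q(A,b)^\nu }. \end{aligned}$$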

Hence, there exists \(\alpha _2\) such that, whenever \(\alpha >\alpha _2\),

$$\begin{aligned} u(\alpha )<\frac{\Phi _{\alpha }(x^0)-\Phi _{\alpha }(x^\alpha )}{\Vert x^{\alpha }\Vert _0} \end{aligned}$$

which contradicts (A12). Hence \(\Phi _{\alpha }(x^0)=\Phi _{\alpha }(x^\alpha )\) whenever \(\alpha >\alpha _2\).

Denote by \(\alpha _1\) the right-hand side of (A10). Taking \(\alpha ^{*}=\max \{\alpha _1,\ \alpha _2\}\), we conclude that \((\mathrm P_0)\) and \((\mathrm{E}_\alpha )\) are equivalent whenever \(\alpha >\alpha ^{*}\). The proof is completed. \(\square \)

Proof of Theorem 4

Let \(\{\alpha _i\mid i=1,2,\ldots \}\) be an increasing sequence with \(\alpha _1=1\) and \(\lim \limits _{i\rightarrow +\infty }\alpha _i=+\infty \). For each \(\alpha _i\), by Theorem 1, the optimal solution \(\hat{x}^i\) to \((\mathrm{E}_{\alpha _i})\) belongs to \(\Delta \). Since \(\Delta \) is a finite set, some element of \(\Delta \), denoted \(\hat{x}\), is the optimal solution to \((\mathrm{E}_{\alpha _{i_k}})\) for an infinite subsequence \(\{\alpha _{i_k}\mid k=1,2,\ldots \}\) of \(\{\alpha _i\}\). Denote by \(\hat{\alpha }\) the smallest member \(\alpha _{i_1}\) of this subsequence. In the following, we show that \((\mathrm P_0)\) and \((\mathrm{E}_{\hat{\alpha }})\) are equivalent.

First, \(\hat{x}\) is the optimal solution to \((\mathrm P_0)\). In fact, for arbitrary \(\alpha _{i_{k}}\ge \hat{\alpha }\), we have

$$\begin{aligned} \Phi _{\alpha _{i_k}}(\hat{x}) =\min _{\{x\mid Ax=b\}} \Phi _{\alpha _{i_k}}(x) \le \min _{\{x\mid Ax=b\}} \Vert x\Vert _{0}. \end{aligned}$$

Letting \(k\rightarrow +\infty \), we have

$$\begin{aligned} \Vert \hat{x}\Vert _{0}&=\lim _{k\rightarrow +\infty }\Phi _{\alpha _{i_k}}(\hat{x})\\&=\lim _{k\rightarrow +\infty }\min _{\{x\mid Ax=b\}} \Phi _{\alpha _{i_k}}(x)\\&\quad \le \min _{\{x\mid Ax=b\}}\Vert x\Vert _{0}, \end{aligned}$$

that is to say, \(\hat{x}\) is the optimal solution to \((\mathrm P_0)\).

Second, \(x^{0}\), the optimal solution to \((\mathrm P_0)\), is the optimal solution to \((\mathrm{E}_{\alpha _{i_k}})\) for every \(k\). In fact, from Theorem 3, there exists \(k_0\) such that \(x^{0}\) is the optimal solution to \((\mathrm{E}_{\alpha _{i_k}})\) whenever \(k\ge k_0\), and \(\Vert x^{0}\Vert _{0}=\Vert \hat{x}\Vert _{0}\). Set \(N=\Vert x^{0}\Vert _{0}\); hence, for any \(k\ge k_0\),

$$\begin{aligned} \sum _{i=1}^N(1-e^{-\alpha _{i_k}\mid x^{0}_i\mid ^\nu })=\sum _{i=1}^N(1-e^{-\alpha _{i_k}\mid \hat{x}_i\mid ^\nu }), \end{aligned}$$

which leads to

$$\begin{aligned} \sum _{i=1}^N e^{-\alpha _{i_k}\mid x^{0}_i\mid ^\nu }=\sum _{i=1}^N e^{-\alpha _{i_k}\mid \hat{x}_i\mid ^\nu }. \end{aligned}$$

Without loss of generality, we assume that \(\mid x^{0}_1\mid \ge \mid x^{0}_2\mid \ge \cdots \ge \mid x^{0}_N\mid \) and \(\mid \hat{x}_1\mid \ge \mid \hat{x}_2\mid \ge \cdots \ge \mid \hat{x}_N\mid \). Due to Lemma 3 (applied with \(x=e^{-\alpha _{i_k}}\), \(t_i=\mid x^{0}_i\mid ^\nu \) and \(s_i=\mid \hat{x}_i\mid ^\nu \)), we have \(\mid x^{0}_1\mid =\mid \hat{x}_1\mid , \mid x^{0}_2\mid =\mid \hat{x}_2\mid ,\ldots ,\mid x^{0}_N\mid =\mid \hat{x}_N\mid \), which implies that \(x^{0}\) is the optimal solution to \((\mathrm{E}_{\alpha _{i_k}})\) for every \(k\). The proof is completed. \(\square \)

Proof of Lemma 1

(1) Let \(x^{*}\) be the optimal solution to \((\mathrm{E}^\lambda _\alpha )\). Then, by the definition of an optimal solution, we have

$$\begin{aligned} E^\lambda _\alpha (x^{*})&=\frac{1}{2}\Vert Ax^{*}-b\Vert ^{2}_{2}+\lambda \sum _{i=1}^n(1-e^{-\alpha \mid x^{*}_i\mid ^\nu })\le E^\lambda _\alpha (0)=\frac{1}{2}\Vert b\Vert ^{2}_{2}\le \Vert b\Vert ^{2}_{2}. \end{aligned}$$

Thus

$$\begin{aligned} \lambda \sum _{i=1}^n (1-e^{-\alpha \mid x^{*}_i\mid ^\nu })\le \Vert b\Vert ^{2}_{2}, \end{aligned}$$
(A13)

which implies that

$$\begin{aligned} 1-e^{-\alpha \Vert x^{*}\Vert ^\nu _{\infty }}\le \frac{\Vert b\Vert ^{2}_{2}}{\lambda }. \end{aligned}$$

Furthermore, if \(\lambda >\Vert b\Vert ^{2}_{2}\), then (1) holds.

Because the function \(r(t)=t^{\nu -1}e^{-\alpha t^\nu }\ (\nu <1,\alpha >0)\) is strictly decreasing for \(t>0\), for any \(i\in supp(x^*)\),

$$\begin{aligned} \lambda \alpha \nu \mid x^{*}_i\mid ^{\nu -1} e^{-\alpha \mid x^{*}_i\mid ^\nu }\ge \lambda \alpha \nu (\Vert x^*\Vert _{\infty })^{\nu -1} e^{-\alpha (\Vert x^*\Vert _{\infty })^\nu }. \end{aligned}$$

This together with inequality (1) gives

$$\begin{aligned} \lambda \alpha \nu \mid x^{*}_i\mid ^{\nu -1} e^{-\alpha \mid x^{*}_i\mid ^\nu }\ge \alpha ^{\frac{1}{\nu }}\nu h(\lambda ). \end{aligned}$$
(A14)

Moreover, replacing h in equality (A3) with the standard basis vector \(e_i\) for every \(i\in supp(x^*)\), we have

$$\begin{aligned} \left( A^{T}(b-Ax^{*})\right) _{i}=\lambda \alpha \nu \mathrm{sgn}(x^{*}_{i})\mid x^{*}_{i}\mid ^{\nu -1}e^{-\alpha \mid x^{*}_{i}\mid ^\nu }. \end{aligned}$$
(A15)

Combining this with the inequality (A14), the inequality (2) holds.

(2) Suppose that \(x^{*}\ne 0\) is the optimal solution to \((\mathrm E^\lambda _\alpha )\) and \(\Vert x^*\Vert _0=k\). By equality (A15), we have

$$\begin{aligned} A^{T}(Ax^{*}-b)+\lambda \alpha \nu \mathrm{diag}(\mathrm{sgn}(x^{*})\mid x^{*}\mid ^{\nu -1})e^{-\alpha \mid x^{*}\mid ^\nu }=0. \end{aligned}$$

Multiplying both sides of the equality above by \(x^{*T}\) yields that

$$\begin{aligned} x^{*T}A^{T}Ax^{*}-x^{*T}A^{T}b+x^{*T}(\lambda \alpha \nu \mathrm{diag}(\mathrm{sgn}(x^{*})\mid x^{*}\mid ^{\nu -1})e^{-\alpha \mid x^{*}\mid ^\nu })=0. \end{aligned}$$

Since \(A^{T}A\) is positive semidefinite, there is

$$\begin{aligned} -x^{*T}A^{T}b+x^{*T}(\lambda \alpha \nu \mathrm{diag}(\mathrm{sgn}(x^{*})\mid x^{*}\mid ^{\nu -1})e^{-\alpha \mid x^{*}\mid ^\nu })\le 0. \end{aligned}$$

Equivalently,

$$\begin{aligned} \sum _{i=1}^k (\lambda \alpha \nu \mid x^{*}_i\mid ^{\nu }e^{-\alpha \mid x^{*}_i\mid ^\nu }-(A^{T}b)_{i}x^{*}_i)\le 0. \end{aligned}$$
(A16)

On the other hand, using the fact that the function \(h(\lambda )\) is strictly increasing, we have

$$\begin{aligned} \alpha ^{\frac{1}{\nu }}\nu h(\lambda )>\Vert A^Tb\Vert _{\infty } \end{aligned}$$

when

$$\begin{aligned} \lambda >\hat{\lambda }= h^{-1}\left( \frac{\Vert A^Tb\Vert _{\infty }}{\nu \alpha ^\frac{1}{\nu }}\right) . \end{aligned}$$

For any \(i\in supp(x^*)\), combining this inequality with the inequality (A14),

$$\begin{aligned} \lambda \alpha \nu \mid x^*_i\mid ^{\nu -1}e^{-\alpha \mid x^*_i\mid ^\nu }>\Vert A^{T}b\Vert _{\infty } \end{aligned}$$

holds. Consequently,

$$\begin{aligned} \lambda \alpha \nu \mid x^*_i\mid ^{\nu -1}e^{-\alpha \mid x^*_i\mid ^\nu }-\mathrm{sgn}(x^*_i)(A^{T}b)_i>0, \end{aligned}$$

which leads to

$$\begin{aligned} \lambda \alpha \nu \mid x^*_i\mid ^{\nu }e^{-\alpha \mid x^*_i\mid ^\nu }-x^*_i(A^{T}b)_i>0, \end{aligned}$$

which, after summing over \(i\in supp(x^{*})\), contradicts inequality (A16), as claimed. \(\square \)

Proof of Theorem 5

Let \(x^{\lambda }\) and \(x^{\alpha }\) be the optimal solutions to \((\mathrm{E}^{\lambda }_\alpha )\) and \((\mathrm{E}_{\alpha })\), respectively. To prove the conclusion, it suffices to show that \(Ax^{\lambda }=b\). In fact, if \(Ax^{\lambda }=b\), then we have

$$\begin{aligned} \Phi _\alpha (x^\lambda )=\Phi _\alpha (x^\alpha ) \end{aligned}$$

and

$$\begin{aligned} \frac{1}{2}\Vert Ax^\lambda -b\Vert _2^2+\lambda \Phi _\alpha (x^\lambda )=\frac{1}{2}\Vert Ax^\alpha -b\Vert _2^2+\lambda \Phi _\alpha (x^\alpha ), \end{aligned}$$

which implies that \((\mathrm{E}^{\lambda }_\alpha )\) and \((\mathrm{E}_\alpha )\) are equivalent.

Suppose that \(Ax^{\lambda }=b^{\lambda }\ne b\). Let \(y^{\alpha }\) be the optimal solution to

$$\begin{aligned} \min \limits _{y\in {\mathbb {R}}^n}\Phi _\alpha (y)\ \ \mathrm{subject\ to}\ \ Ay=b-b^\lambda , \end{aligned}$$

then \(A(x^\lambda +y^\alpha )=b\). By Theorem 1 and its proof, the column submatrix \(A^*\) of A consisting of the columns indexed by \(supp(y^\alpha )\) has full column rank and \(\Vert y^\alpha \Vert _0=k\le m\). Besides, due to the definition of \(\sigma _{\min }\), we have

$$\begin{aligned} \sigma _{\min }^2\le \frac{\Vert A^*\hat{y}^\alpha \Vert _2^2}{\Vert \hat{y}^\alpha \Vert _2^2}, \end{aligned}$$
(A17)

where \(\hat{y}^\alpha \in {\mathbb {R}}^k\) consists of the non-zero elements of \(y^\alpha \). By the matrix-norm inequality \(\Vert Ay\Vert _2\le \Vert A\Vert _2\Vert y\Vert _2\), we have

$$\begin{aligned} \Vert \hat{y}^\alpha \Vert _2^2=\Vert y^\alpha \Vert _2^2&\ge \frac{1}{\Vert A\Vert _2^2}\Vert Ay^\alpha \Vert _2^2\\&=\frac{1}{\Vert A\Vert _2^2}\Vert Ax^\lambda -b\Vert _2^2\\&\quad \ge \frac{1}{\Vert A\Vert _2^4}\Vert A^T(Ax^\lambda -b)\Vert _2^2\\&\quad \ge \frac{\alpha ^{\frac{2}{\nu }}\nu ^2}{\Vert A\Vert _2^4}h^2(\lambda ), \end{aligned}$$

in which the last inequality holds by the inequality (2).

Furthermore, as \(\lambda \Phi _\alpha (y^\alpha )\le \lambda m\), it is true that

$$\begin{aligned} \frac{\lambda \Phi _\alpha (y^\alpha )}{\Vert y^\alpha \Vert _2^2}\le \frac{\lambda m\Vert A\Vert _2^4}{\alpha ^{\frac{2}{\nu }}\nu ^2h^2(\lambda )}. \end{aligned}$$

Combining it with (6) and (A17), we get

$$\begin{aligned} \frac{\lambda \Phi _\alpha (y^\alpha )}{\Vert y^\alpha \Vert _2^2}<\frac{1}{2}\sigma ^2_{min}\le \frac{1}{2}\frac{\Vert A^*\hat{y}^\alpha \Vert _2^2}{\Vert \hat{y}^\alpha \Vert _2^2}=\frac{1}{2}\frac{\Vert Ay^\alpha \Vert _2^2}{\Vert y^\alpha \Vert _2^2}, \end{aligned}$$

which implies that

$$\begin{aligned} \lambda \Phi _\alpha (y^\alpha )<\frac{1}{2}\Vert Ay^\alpha \Vert _2^2=\frac{1}{2}\Vert Ax^\lambda -b\Vert _2^2. \end{aligned}$$
(A18)

Therefore, we obtain

$$\begin{aligned} \lambda \Phi _\alpha (x^\alpha )&\le \lambda \Phi _\alpha (x^\lambda +y^\alpha )\\&=\frac{1}{2}\Vert A(x^\lambda +y^\alpha )-b\Vert _2^2+\lambda \Phi _\alpha (x^\lambda +y^\alpha )\nonumber \\&\quad \le \lambda \Phi _\alpha (x^\lambda )+\lambda \Phi _\alpha (y^\alpha )\nonumber \\&\quad <\lambda \Phi _\alpha (x^\lambda )+\frac{1}{2}\Vert Ax^\lambda -b\Vert _2^2\nonumber \\&\quad \le \lambda \Phi _\alpha (x^\alpha ).\nonumber \end{aligned}$$
(A19)

Since \(x^{\alpha }\) and \(x^{\lambda }\) are the optimal solutions to \((\mathrm{E}_{\alpha })\) and \((\mathrm{E}^{\lambda }_{\alpha })\), respectively, the first and the last inequalities hold. The second equality holds because \(A(x^\lambda +y^\alpha )=b\). The third inequality is true by Lemma 2, and the fourth inequality holds by (A18). This is a contradiction, and hence \(Ax^\lambda =b\), as claimed. \(\square \)

Proof of Theorem 6

We define the objective function of (8) as

$$\begin{aligned} l(x)={\frac{1}{2}\Vert Ax-b\Vert ^{2}_{2}+\lambda \Vert x\Vert _{1}-\langle x,v^{k}\rangle } \end{aligned}$$

and its surrogate function is

$$\begin{aligned}&l_\mu (x,z)\nonumber \\ =&\frac{\mu }{2}\Vert Ax-b\Vert ^{2}_{2}+\mu \lambda \Vert x\Vert _{1}-\mu \langle x,v^{k}\rangle -\frac{\mu }{2}\Vert Ax-Az\Vert ^2_2+\frac{1}{2}\Vert x-z\Vert ^{2}_{2}\nonumber \\ =&\sum _{i=1}^n[\frac{1}{2} x^2_i-x_i(z_{i}+\mu v^{k}_{i}+\mu (A^{T}b)_i-\mu (A^{T}Az)_i)]+\lambda \mu \sum _{i=1}^n (\mid x_i\mid )\nonumber \\&~~+\frac{\mu }{2} \Vert b\Vert ^2-\frac{\mu }{2} \Vert Az\Vert ^2+\frac{1}{2} \Vert z\Vert ^2 \end{aligned}$$
(A20)

where \(z\) is an auxiliary variable and \(\mu \) is a positive parameter. Then we have the following optimization problem

$$\begin{aligned} \min \limits _{x,z\in {\mathbb {R}}^n}l_\mu (x,z). \end{aligned}$$
(A21)

For a fixed \(z\in {\mathbb {R}}^n\), since the components \(x_i\) are decoupled, we can minimize the function \(l_\mu (x,z)\) with respect to each \(x_i\) individually. Based on the equality (A20), the objective function of this component-wise minimization is

$$\begin{aligned} l_\mu (x_{i},z)=\frac{1}{2} x^{2}_{i}-x_{i}(z_{i}+\mu v^{k}_{i}+\mu (A^{T}b)_i-\mu (A^{T}Az)_i)+\lambda \mu \mid x_{i}\mid . \end{aligned}$$

Based on [14], we know that the minimizer \(x^*=(x^*_1,\ldots ,x^*_n)\) of this minimization problem is

$$\begin{aligned} x^*_{i}=S_{\lambda \mu }(z_{i}+\mu v^{k}_{i}+\mu (A^{T}b)_i-\mu (A^{T}Az)_i), \end{aligned}$$
(A22)

where \(S_{\lambda \mu }(t)=\mathrm{sgn}(t)\max \{0,\mid t\mid -\lambda \mu \}\) for any \(t\in {\mathbb {R}}\).
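For concreteness, the thresholding operator \(S_{\lambda \mu }\) and the component-wise update (A22) translate directly into a few lines of Python (our illustrative sketch; the paper itself provides no code):

import numpy as np

def soft_threshold(t, tau):
    # S_tau(t) = sgn(t) * max(0, |t| - tau), the soft thresholding operator
    return np.sign(t) * np.maximum(0.0, np.abs(t) - tau)

def componentwise_update(z, v, A, b, lam, mu):
    # Vectorized form of (A22):
    # x_i = S_{lam*mu}( z_i + mu*v_i + mu*(A^T b)_i - mu*(A^T A z)_i )
    u = z + mu * v + mu * (A.T @ b) - mu * (A.T @ (A @ z))
    return soft_threshold(u, lam * mu)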

Moreover, because of the condition \(0<\mu \le \Vert A\Vert _2^{-2}\), which ensures that \(\mu \Vert Ax-Ax^*\Vert ^2_2\le \Vert x-x^*\Vert ^2_2\), we have

$$\begin{aligned}&l_\mu (x,x^*)\\&\quad =\frac{\mu }{2}\Vert Ax-b\Vert ^{2}_{2}+\mu \lambda \Vert x\Vert _{1}-\mu \langle x,v^{k}\rangle -\frac{\mu }{2}\Vert Ax-Ax^*\Vert ^2_2+\frac{1}{2}\Vert x-x^*\Vert ^{2}_{2}\\&\quad \ge \frac{\mu }{2}\Vert Ax-b\Vert ^{2}_{2}+\mu \lambda \Vert x\Vert _{1}-\mu \langle x,v^{k}\rangle \\&\quad \ge \mu l(x^*)\\&\quad = l_\mu (x^*,x^*), \end{aligned}$$

which implies that \(x^*\) is a minimizer of \(l_\mu (\cdot ,x^*)\) provided \(x^*\) is the optimal solution to (8).

Taking \(z=x^*\) in the equality (A22), we have

$$\begin{aligned} x^*_{i}=S_{\lambda \mu }(x^*_{i}+\mu v^{k}_{i}+\mu (A^{T}b)_i-\mu (A^{T}Ax^*)_i), \end{aligned}$$

as claimed. \(\square \)
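Putting the pieces together, a DC-style outer loop built on this update might look as follows in Python. This is our reading, not the authors' code: in particular, taking \(v^{k}\) to be a subgradient of \(\lambda (\Vert x\Vert _{1}-\Phi _{\alpha }(x))\) at the current iterate is an assumption about the DC decomposition behind problem (8), and the step size and iteration counts are placeholders.

import numpy as np

def dcs_sketch(A, b, lam, alpha, nu=1.0, outer_iters=50, inner_iters=200):
    # Hypothetical DCS-style loop: linearize the assumed concave part of the
    # penalty, then solve the convex subproblem (8) by the soft-thresholding
    # iteration of Theorem 6. Illustrative only.
    n = A.shape[1]
    x = np.zeros(n)
    mu = 1.0 / np.linalg.norm(A, 2) ** 2      # assumed step size 0 < mu <= ||A||_2^{-2}
    for _ in range(outer_iters):
        # assumed subgradient of lam * (||x||_1 - Phi_alpha(x)) at x (0 where x_i = 0)
        v = np.zeros(n)
        s = x != 0
        v[s] = lam * np.sign(x[s]) * (
            1.0 - alpha * nu * np.abs(x[s]) ** (nu - 1.0) * np.exp(-alpha * np.abs(x[s]) ** nu)
        )
        z = x.copy()
        for _ in range(inner_iters):
            u = z + mu * v + mu * (A.T @ (b - A @ z))
            z = np.sign(u) * np.maximum(0.0, np.abs(u) - lam * mu)
        x = z
    return x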

About this article


Cite this article

Li, H., Zhang, Q., Lin, S. et al. Sparse signal recovery via generalized Gaussian function. J Glob Optim 83, 783–801 (2022). https://doi.org/10.1007/s10898-022-01126-2
