
1 Introduction

An important problem in signal processing, particularly in the fields of compressed sensing and sparse coding, is to find the “sparsest” solution of a linear system \(\varPhi x =b\), where \(\varPhi \in {{\mathbb {R}}}^{m \times n}\) with \(m<n\); that is, a solution with as many null coordinates as possible. This problem has applications in several areas, such as medical imaging, error correction, digital cameras and wireless communication [9]. Sparsity may be measured by the \(\ell _{0}\) pseudo-norm \(||x||_{0}\), which counts the number of non-zero coordinates. The problem of interest is then:

$$\begin{aligned} (P_{0}) \qquad \min \limits _{\varPhi x=b, \; x \in {{\mathbb {R}}}^{n}} \; ||x||_{0}. \end{aligned}$$

This problem is NP-hard in general [8]. A common alternative is to replace the \(\ell _{0}\) pseudo-norm by a weighted \(\ell _{1}\) norm:

$$\begin{aligned} (P_{1}W) \qquad x^{w} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{\varPhi x=b, \; x \in {{\mathbb {R}}}^{n}} \; \sum \limits _{i=1}^{n} w_{i} |x_{i}|. \end{aligned}$$

Problem (\(P_{1}W\)) is convex, so it may be solved efficiently, although it is not always equivalent to (\(P_{0}\)). Note that \(\ell _{1}\) minimization is obtained by using unit weights in (\(P_{1}W\)). For this particular case there are important results about its equivalence with (\(P_{0}\)), mainly due to Donoho [7] and Candès, Romberg and Tao [4]. For the general case, the task is to choose “useful weights” for (\(P_{1}W\)), defined as those that make \(x^{w}\) a solution of (\(P_{0}\)). Candès, Wakin and Boyd (CWB) proposed an iterative algorithm, known as Re-Weighted \(\ell _{1}\) (RW\(\ell _{1}\)), to estimate useful weights [5]. The algorithm updates the weights as follows:

$$\begin{aligned} w_{i}^{k+1} = \frac{1}{|x_{i}^{k}|+ \epsilon _{k}}, \forall k \ge 0; \end{aligned}$$
(1)

for some \(\epsilon _{k}>0\) and with:

$$\begin{aligned} x^{k} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{\varPhi x=b, \; x \in {{\mathbb {R}}}^{n}} \; \sum \limits _{i=1}^{n} w_{i}^{k} |x_{i}|, \quad \forall k \ge 0. \end{aligned}$$
(2)
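For concreteness, the CWB iteration (1)–(2) can be sketched in a few lines. The weighted \(\ell _{1}\) subproblem (2) becomes a linear program after the standard lift \(|x_{i}| \le t_{i}\); the sketch below is ours, for illustration only: the choice of SciPy's linprog solver and the helper names weighted_l1_min and cwb_rw_l1 are hypothetical, not part of [5].

```python
import numpy as np
from scipy.optimize import linprog


def weighted_l1_min(Phi, b, w):
    """Solve min_x sum_i w_i |x_i| s.t. Phi x = b, via the LP lift |x_i| <= t_i."""
    m, n = Phi.shape
    c = np.concatenate([np.zeros(n), w])            # objective acts only on t
    A_eq = np.hstack([Phi, np.zeros((m, n))])       # Phi x = b
    I = np.eye(n)
    A_ub = np.vstack([np.hstack([I, -I]),           #  x - t <= 0
                      np.hstack([-I, -I])])         # -x - t <= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
                  bounds=[(None, None)] * n + [(0, None)] * n, method="highs")
    return res.x[:n]


def cwb_rw_l1(Phi, b, n_iter=4, eps=0.1):
    """Re-Weighted l1 of CWB: weight update (1) around the weighted l1 step (2)."""
    w = np.ones(Phi.shape[1])                       # unit weights: plain l1 minimization
    for _ in range(n_iter):
        x = weighted_l1_min(Phi, b, w)              # Eq. (2)
        w = 1.0 / (np.abs(x) + eps)                 # Eq. (1)
    return x
```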

In this work we propose a new methodology to estimate weights, based on the theory of Lagrange duality. Using this methodology, together with an algorithm for estimating solutions of a dual problem, we obtain a new RW\(\ell _{1}\) algorithm. The methodology is also applied to a noisy linear system, obtaining in this case a Re-Weighted LASSO algorithm (RW-LASSO).

The rest of the paper is organized as follows: Sect. 2 introduces the proposed methodology in the oracle case, in which a solution of (\(P_{0}\)) is known; here an oracle dual problem is obtained. Section 3 describes some solutions of this dual problem. In Sect. 4 a new RW\(\ell _{1}\) algorithm is obtained by applying the proposed methodology with the subgradient algorithm. Section 5 extends the methodology and the RW\(\ell _{1}\) subgradient algorithm to the non-oracle case, in which no solution of (\(P_{0}\)) is known. Section 6 generalizes the methodology to the case in which the linear system is affected by noise; here an RW-LASSO algorithm is obtained. Section 7 analyzes the performance of the proposed RW\(\ell _{1}\) algorithm in the noiseless case, and of the RW-LASSO algorithm in the noisy case, both applied to random linear systems. Section 8 gives the final conclusions.

2 Methodology with Oracle

The proposed methodology is introduced in the ideal case in which a solution \(x^{*}\) of (\(P_{0}\)) is known. Consider the ideal primal problem defined as:

$$\begin{aligned} (P) \qquad \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{ \begin{array}{ll} \varPhi x=b, \; x \in {{\mathbb {R}}}^{n} \\ |x_{i}| \le |x^{*}_{i}|, \; \forall i \end{array} } \; 0. \end{aligned}$$

This is a convex problem, so it can be solved efficiently. Also, any solution of (P) is a solution of (\(P_{0}\)). Of course (P) is ideal, since \(x^{*}\) is assumed to be known, so it has no practical value. Consider the Lagrange relaxation obtained by relaxing only the constraints involving \(x^{*}\). The associated Lagrangian is:

$$\begin{aligned} L(x,w) = \sum \limits _{i=1}^{n} w_{i} \left( |x_{i}| - |x^{*}_{i}| \right) = \sum \limits _{i=1}^{n} w_{i} |x_{i}| - \sum \limits _{i=1}^{n} w_{i} |x^{*}_{i}|, \end{aligned}$$
(3)

where \(w_{i} \ge 0\) are the Lagrange multipliers. The dual function is then:

$$\begin{aligned} d(w) := \min _{ \begin{array}{ll} \varPhi x=b \\ x \in {{\mathbb {R}}}^{n} \end{array} } L \left( x, w \right) = \left( \min _{ \begin{array}{ll} \varPhi x=b \\ x \in {{\mathbb {R}}}^{n} \end{array} } \sum \limits _{i=1}^{n} w_{i} |x_{i}| \right) - \sum \limits _{i=1}^{n} w_{i} |x^{*}_{i}|. \end{aligned}$$
(4)

This dual function involves a Weighted \(\ell _{1}\) problem, in which the weights are Lagrange multipliers. This is the key idea behind the proposed methodology: identify the weights of (\(P_{1}W\)) with Lagrange multipliers. The problem is then placed in the context of Lagrange duality. In particular, weights may be estimated by any algorithm for estimating multipliers. Equivalently, weights may be estimated as solutions of the dual problem, given by:

$$\begin{aligned} (D) \qquad d^{*} = \max \limits _{w \ge 0} \; d(w). \end{aligned}$$

The dual function is always concave, so this maximization problem may be solved efficiently. One drawback is that the dual function is usually non-differentiable, so, for example, gradient-based algorithms must be replaced by subgradient methods.

3 Solutions of the Dual Problem

Now the interest is to find “useful solutions” of the dual (D). That is: \(w \ge 0\) such that \(x^{w}\) is a solution of (\(P_{0}\)). This section shows that such solutions always exist, although not every solution of (D) has this property.

Proposition 1

Primal problem (P) satisfies strong duality: \(d^{*}=f^{*}\).

Proof

The primal optimal value is clearly \(f^{*}=0\). By weak duality: \(d^{*} \le f^{*}=0\). So it suffices to show that \(d(w)=0\), for some \(w \ge 0\). Taking \(w=\mathbf {0}\):

$$\begin{aligned} d(\mathbf {0}) = \left( \min _{ \begin{array}{ll} \varPhi x=b \\ x \in {{\mathbb {R}}}^{n} \end{array} } \sum \limits _{i=1}^{n} 0 |x_{i}| \right) - \sum \limits _{i=1}^{n} 0 |x^{*}_{i}|=0. \end{aligned}$$
(5)

\(\square \)

The proof also shows that \(w=\mathbf {0}\) is a solution of (D). Clearly \(w = \mathbf {0}\) is not necessarily a useful solution, since \(x^{\mathbf {0}}\) could be any solution of the linear system:

$$\begin{aligned} x^{\mathbf {0}} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{\varPhi x=b, \; x \in {{\mathbb {R}}}^{n}} \; \sum \limits _{i=1}^{n} 0 \, |x_{i}| = \{ x \in {{\mathbb {R}}}^{n} : \varPhi x=b \}. \end{aligned}$$

A consequence of strong duality is that the set of Lagrange multipliers and the set of dual solutions coincide. Therefore, useful weights may be estimated as dual solutions. The following result shows that the dual problem always admits useful weights as solutions.

Proposition 2

Let \({\hat{w}} \ge 0\) be such that \({\hat{w}}_{i}=0 \Leftrightarrow x_{i}^{*} \ne 0\). Then every solution \(x^{{\hat{w}}}\) of the problem (\(P_{1}W\)) associated with \({\hat{w}}\) is a solution of (\(P_{0}\)).

Proof

Let \(I = \{ i / x_{i}^{*}=0 \}\). By definition of \({\hat{w}}\) and \(x^{{\hat{w}}}\), and using that \(\varPhi x^{*} = b\):

$$\begin{aligned} 0 \le \sum \limits _{i \in I}^{} {\hat{w}}_{i}|x^{{\hat{w}}}_{i}| = \sum \limits _{i=1}^{n} {\hat{w}}_{i}|x^{{\hat{w}}}_{i}| \le \sum \limits _{i=1}^{n} {\hat{w}}_{i}|x^{*}_{i}| = \sum \limits _{i \in I}^{} {\hat{w}}_{i}|x^{*}_{i}| = 0. \end{aligned}$$
(6)

This implies: \({\hat{w}}_{i}|x^{{\hat{w}}}_{i}| = 0, \forall i \in I\). Since \({\hat{w}}_{i} > 0, \forall i \in I\), we must have \(x^{{\hat{w}}}_{i} = 0, \forall i \in I\). So: \(||x^{{\hat{w}}}||_{0} \le ||x^{*}||_{0}\). Since \(\varPhi x^{{\hat{w}}}=b\) by definition, \(x^{{\hat{w}}}\) solves (\(P_{0}\)). \(\square \)
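As a small illustrative instance of Proposition 2 (our own example, not taken from the paper's experiments), take

$$\begin{aligned} \varPhi = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \quad b = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad x^{*} = (0,0,1)^{T}, \quad ||x^{*}||_{0}=1. \end{aligned}$$

The solutions of \(\varPhi x = b\) are \(x(t) = (1-t, 1-t, t)\), \(t \in {{\mathbb {R}}}\). Choosing \({\hat{w}} = (1,1,0)\), which satisfies the hypothesis of Proposition 2, the weighted \(\ell _{1}\) objective is \(|x_{1}|+|x_{2}| = 2|1-t|\), minimized only at \(t=1\). Hence \(x^{{\hat{w}}} = (0,0,1) = x^{*}\), which indeed solves (\(P_{0}\)).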

4 RW\(\ell _{1}\) with Projected Subgradient Algorithm

In this section we give an implementation of the proposed methodology, using the projected subgradient algorithm to estimate solutions of the dual problem. This algorithm may be thought of as a (sub)gradient “ascent”, with a projection onto the dual feasible set. More specifically, starting at \(w^{0} \ge 0\), the update is:

$$\begin{aligned} \left\{ \begin{array}{ll} w^{k+1} = w^{k} + \alpha _{k} g^{k}\\ w^{k+1} = \max \{ 0,w^{k+1} \} \end{array} \right. , \forall k \ge 0; \end{aligned}$$
(7)

where \(g^{k} \in \partial d(w^{k})\) is a subgradient of the dual function at \(w^{k}\), and \(\alpha _{k} >0\) is the stepsize. Although this is not strictly an ascent method, the stepsize can always be chosen so as to decrease the distance from \(w^{k}\) to the dual solution set. One way to do this is to set the stepsize as [3]:

$$\begin{aligned} \alpha _{k} = \frac{d^{*}-d(w^{k})}{||g^{k}||^{2}_{2}} \ge 0, \; \forall k \ge 0. \end{aligned}$$
(8)

Applying [2, Example 3.1.2] to (P), it can be seen that a subgradient \(g^{k} \in \partial d(w^{k})\) can be obtained by solving a Weighted \(\ell _{1}\) problem:

$$\begin{aligned} x^{k} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{\varPhi x=b, \; x \in {{\mathbb {R}}}^{n}} \; \sum \limits _{i=1}^{n} w_{i}^{k} |x_{i}| \;\; \Rightarrow \;\; g(x^{k}) \in \partial d(w^{k}), \quad \forall k \ge 0. \end{aligned}$$
(9)

Here \(g(x)\) denotes the vector of relaxed constraint functions, with \(g_{i}(x) = |x_{i}| - |x^{*}_{i}|\). Note that the stepsize can now be written as:

$$\begin{aligned} \alpha _{k} = \frac{d^{*}-d(w^{k})}{||g^{k}||^{2}_{2}} = \frac{0-L(x^{k},w^{k})}{||g(x^{k})||_{2}^{2}} = - \frac{ \sum \limits _{i=1}^{n} w_{i}^{k} \left( |x_{i}^{k}| - |x_{i}^{*}| \right) }{\sum \limits _{i=1}^{n} \left( |x_{i}^{k}| - |x_{i}^{*}| \right) ^{2}}, \forall k \ge 0. \end{aligned}$$
(10)

Algorithm 1 shows a pseudocode of the proposed RW\(\ell _{1}\) subgradient algorithm.

[Algorithm 1: pseudocode of the proposed RW\(\ell _{1}\) subgradient algorithm (oracle case)]
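A minimal numerical sketch of Algorithm 1 follows (our own illustration: the weighted \(\ell _{1}\) step reuses the LP lift from the sketch in Sect. 1, and all function names are hypothetical):

```python
import numpy as np
from scipy.optimize import linprog


def weighted_l1_min(Phi, b, w):
    """min_x sum_i w_i |x_i| s.t. Phi x = b, via the LP lift |x_i| <= t_i."""
    m, n = Phi.shape
    c = np.concatenate([np.zeros(n), w])
    A_eq = np.hstack([Phi, np.zeros((m, n))])
    I = np.eye(n)
    A_ub = np.vstack([np.hstack([I, -I]), np.hstack([-I, -I])])
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n), A_eq=A_eq, b_eq=b,
                  bounds=[(None, None)] * n + [(0, None)] * n, method="highs")
    return res.x[:n]


def rw_l1_oracle(Phi, b, x_star, n_iter=5):
    """Algorithm 1: projected subgradient ascent on the oracle dual (D)."""
    w = np.ones(Phi.shape[1])                      # starting weights w^0 >= 0
    for _ in range(n_iter):
        x = weighted_l1_min(Phi, b, w)             # Eq. (9): weighted l1 step
        g = np.abs(x) - np.abs(x_star)             # subgradient g(x^k) of d at w^k
        if np.dot(g, g) < 1e-12:                   # |x^k| already matches the oracle
            break
        alpha = -np.dot(w, g) / np.dot(g, g)       # stepsize (10), with d* = 0
        w = np.maximum(0.0, w + alpha * g)         # Eq. (7): ascent step + projection
    return x, w
```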

5 Methodology and Algorithm Without Oracle

The proposed methodology is now extended to the practical case, in which no solution of (\(P_{0}\)) is known. A simple way to do this is to replace \(x^{*}\) in the ideal constraints by its best known estimate \(x^{k}\), “amplified” by some \(\epsilon _{k}>0\):

$$\begin{aligned} g_{i}^{k}(x)= |x_{i}| - \left( 1 + \epsilon _{k} \right) |x_{i}^{k}|, \forall k \ge 0; \end{aligned}$$
(11)

where \(x^{k}\) is calculated in the same way as in the oracle case:

$$\begin{aligned} x^{k} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{\varPhi x=b, \; x \in {{\mathbb {R}}}^{n}} \; \sum \limits _{i=1}^{n} w_{i}^{k} |x_{i}|, \quad \forall k \ge 0. \end{aligned}$$

This gives specific constraints \(g^{k}(\cdot )\) at each step k, with the corresponding primal problem:

$$\begin{aligned} (P^{k}) \qquad \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{ \begin{array}{ll} \varPhi x=b, \; x \in {{\mathbb {R}}}^{n} \\ g_{i}^{k}(x) \le 0, \; \forall i \end{array} } \; 0. \end{aligned}$$

Since \(x^{k}\) is always feasible for (\(P^{k}\)), this problem has optimal value \(f^{k}=0\). By relaxing its non-ideal constraints, a dual problem may be obtained. The Lagrangian and the dual function are, respectively:

$$\begin{aligned} L^{k}(x,w)= & {} \sum \limits _{i=1}^{n} w_{i} |x_{i}| - \sum \limits _{i=1}^{n} w_{i} \left( 1 + \epsilon _{k} \right) |x^{k}_{i}|, \end{aligned}$$
(12)
$$\begin{aligned} d^{k}(w)= & {} \left( \min _{ \begin{array}{ll} \varPhi x=b \\ x \in {{\mathbb {R}}}^{n} \end{array} } \sum \limits _{i=1}^{n} w_{i} |x_{i}| \right) - \sum \limits _{i=1}^{n} w_{i} \left( 1 + \epsilon _{k} \right) |x^{k}_{i}|. \end{aligned}$$
(13)

As in the oracle case, each dual function involves a Weighted \(\ell _{1}\) problem, with the weights playing the role of Lagrange multipliers. This allows the methodology to be extended, by estimating the weights of (\(P^{k}\)) as Lagrange multipliers, or equivalently as solutions of its dual problem:

$$\begin{aligned} (D^{k}) \qquad d^{k} = \max \limits _{w \ge 0} \; d^{k}(w). \end{aligned}$$

Solutions of (\(D^{k}\)) may be analyzed in a similar way as for (D). In particular, it can easily be seen that (\(P^{k}\)) satisfies strong duality, with optimal values \(f^{k}=d^{k}=0\). Knowing the optimal value \(d^{k}\) of (\(D^{k}\)) is very useful for computing the stepsize of the subgradient algorithm when applied to (\(D^{k}\)):

$$\begin{aligned} \alpha _{k} = \frac{d^{k}-d^{k}(w^{k})}{||g^{k}(x^{k})||_{2}^{2}} = \frac{0-L^{k}(x^{k},w^{k})}{||g^{k}(x^{k})||_{2}^{2}} = \frac{1}{\epsilon _{k}} \frac{ \Vert W^{k} x^{k} \Vert _{1} }{\Vert x^{k}\Vert _{2}^{2}} \ge 0. \end{aligned}$$
(14)

Here \(W^{k}\) denotes the diagonal matrix with diagonal entries \(w^{k}_{i}\). Algorithm 2 shows the pseudo-code of the non-oracle RW\(\ell _{1}\) method, obtained by combining the proposed methodology with the projected subgradient algorithm.

[Algorithm 2: pseudocode of the proposed RW\(\ell _{1}\) subgradient algorithm (non-oracle case)]

At each step of Algorithm 2, and before the projection, the update is:

$$\begin{aligned} w_{i}^{k+1} = w_{i}^{k} + \alpha _{k} g_{i}^{k}(x^{k}) = w_{i}^{k} - \frac{ \Vert W^{k} x^{k} \Vert _{1} }{\Vert x^{k}\Vert _{2}^{2}} |x_{i}^{k}|, \forall k \ge 0; \end{aligned}$$

so Algorithm 2 is independent of \(\epsilon _{k}>0\). We take \(\epsilon _{k}=1, \forall k \ge 0\).
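As an illustration, a single weight update of Algorithm 2 takes only a few lines (a sketch under our own naming; \(x\) is assumed to be a solution \(x^{k}\) of the weighted \(\ell _{1}\) problem for the current weights, computed separately, e.g. as in the earlier sketches):

```python
import numpy as np


def rw_l1_weight_update(w, x):
    """One non-oracle update of Algorithm 2: projected subgradient step on the weights."""
    step = np.dot(w, np.abs(x)) / np.dot(x, x)     # ||W^k x^k||_1 / ||x^k||_2^2
    w_new = w - step * np.abs(x)                   # update before projection (eps_k cancels)
    return np.maximum(0.0, w_new)                  # projection onto w >= 0
```

Iterating this update together with the weighted \(\ell _{1}\) step reproduces the non-oracle RW\(\ell _{1}\) loop.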

6 Problem with Noise

In this section we consider the case in which the linear system is affected by noise. That is: \(b = \varPhi x^{*} + z\), where z represents the noise. The problem of interest is now:

$$\begin{aligned} \min \limits _{\frac{1}{2} ||\varPhi x-b||^{2}_{2} \le \frac{\eta ^{2}}{2}} \; ||x||_{0}. \end{aligned}$$

This problem is also NP-hard in general, for any level of noise \(\eta \ge 0\) [8]. Replacing the \(\ell _{0}\) pseudo-norm by a weighted \(\ell _{1}\) norm, we obtain a convex alternative:

$$\begin{aligned} (P_{1}^{\eta }W) \qquad x^{w} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{\frac{1}{2} ||\varPhi x-b||^{2}_{2} \le \frac{\eta ^{2}}{2}} \; \sum \limits _{i=1}^{n} w_{i} |x_{i}|. \end{aligned}$$

The proposed methodology is the same as in the noiseless case. Now the oracle primal problem is:

$$\begin{aligned} \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{ \begin{array}{ll} \frac{1}{2} ||\varPhi x-b||^{2}_{2} \le \frac{\eta ^{2}}{2} \\ |x_{i}| \le |x^{*}_{i}|, \; \forall i \end{array} } \; 0. \end{aligned}$$
(15)

The Lagrangian obtained by relaxing the ideal constraints is the same as for the noiseless case. The dual function is now:

$$\begin{aligned} d(w) = \left( \min _{ \begin{array}{ll} \frac{1}{2} ||\varPhi x-b||^{2}_{2} \le \frac{\eta ^{2}}{2} \end{array} } \sum \limits _{i=1}^{n} w_{i} |x_{i}| \right) - \sum \limits _{i=1}^{n} w_{i} |x_{i}^{*}|. \end{aligned}$$
(16)

This is a Weighted \(\ell _{1}\) problem with a quadratic constraint. As in the noiseless case, the weights can be identified with Lagrange multipliers. So the methodology and the RW\(\ell _{1}\) subgradient algorithm are the same as for the noiseless case, but with (\(P_{1}W\)) replaced by (\(P_{1}^{\eta }W\)). Going a step further, if the quadratic constraint is also relaxed, a new dual function is obtained:

$$\begin{aligned} d(w,\lambda ) = \left( \min _{ \begin{array}{ll} x \in {{\mathbb {R}}}^{n} \end{array} } \frac{\lambda }{2}\Vert \varPhi x-b\Vert _{2}^{2} + \sum \limits _{i=1}^{n} w_{i} |x_{i}| \right) - \left( \frac{\lambda }{2} \eta ^{2} + \sum \limits _{i=1}^{n} w_{i} |x_{i}^{*}| \right) . \end{aligned}$$
(17)

This involves the well-known Weighted LASSO problem, a simple generalization of the LASSO problem introduced by Tibshirani in statistics [10]. Chen, Donoho and Saunders introduced the same LASSO problem in the context of signal representation, under the name Basis Pursuit Denoising [6]. Note that useful weights of (\(P_{1}^{\eta }W\)) can still be estimated as part of the Lagrange multipliers, which are now \(w \in {{\mathbb {R}}}^{n}_{+}\) and \(\lambda \in {{\mathbb {R}}}_{+}\). When combined with the projected subgradient algorithm, this gives an RW-LASSO algorithm, in which at each step a Weighted-LASSO problem must be solved instead of (\(P_{1}^{\eta }W\)):

$$\begin{aligned} x^{k} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{x \in {{\mathbb {R}}}^{n}} \frac{\lambda ^{k}}{2}||\varPhi x-b||^{2}_{2} + \sum \limits _{i=1}^{n} w_{i}^{k} |x_{i}|, \forall k \ge 0. \end{aligned}$$
(18)
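For illustration, the weighted LASSO step (18) can be approximated with a simple proximal gradient loop. The paper uses FISTA [1]; the sketch below (ours, with a hypothetical function name) uses plain ISTA, i.e. gradient steps on the quadratic term followed by weighted soft-thresholding, to keep it short:

```python
import numpy as np


def weighted_lasso_ista(Phi, b, w, lam, n_iter=500):
    """Approximately solve (18): min_x (lam/2)||Phi x - b||_2^2 + sum_i w_i |x_i|."""
    L = lam * np.linalg.norm(Phi, 2) ** 2                     # Lipschitz constant of the smooth part
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = lam * Phi.T @ (Phi @ x - b)                    # gradient of the quadratic term
        z = x - grad / L                                      # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - w / L, 0.0)   # weighted soft-thresholding
    return x
```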

Algorithm 3 shows a pseudocode for the proposed subgradient RW-LASSO algorithm.

[Algorithm 3: pseudocode of the proposed subgradient RW-LASSO algorithm]

The CWB RW\(\ell _{1}\) algorithm can also be extended to the noisy model, by updating the weights as in the noiseless case, but taking [5]:

$$\begin{aligned} x^{k} \in \mathop {{{\,\mathrm{arg\!\min }\,}}}\limits _{\frac{1}{2} ||\varPhi x-b||^{2}_{2} \le \frac{\eta ^{2}}{2}} \; \sum \limits _{i=1}^{n} w_{i}^{k} |x_{i}|, \quad \forall k \ge 0. \end{aligned}$$
(19)

7 Experimental Results

7.1 Results for the Noise-Free Setting

This section analyzes the performance of the proposed RW\(\ell _{1}\) subgradient algorithm when applied to random linear systems, taking the CWB method as a reference. For a given level of sparsity s, a random linear system \(\varPhi x=b\) is generated, with a solution \(x^{*}\) such that \(\Vert x^{*}\Vert _{0} \le s\). The experimental setting is based on [5]:

  1. Generate \(\varPhi \in {{\mathbb {R}}}^{m \times n}\), with \(n=256\), \(m=100\) and independent Gaussian entries:

     $$\begin{aligned} \varPhi _{ij} \sim N \left( 0,\sigma =\frac{1}{\sqrt{m}} \right) , \forall i,j. \end{aligned}$$

     Note that in particular \(\varPhi \) will have normalized columns (in expected value).

  2. Randomly select a set \(I_{s} \subset \{ 1,\ldots ,n \}\) of s indices, representing the coordinates of \(x^{*}\) where non-null values are allowed.

  3. Generate the values \(x^{*}_{i}, i \in I_{s}\), as independent Gaussian variables:

     $$\begin{aligned} x^{*}_{i} \sim N \left( 0, \sigma = \frac{1}{\sqrt{s}} \right) , \forall i \in I_{s}. \end{aligned}$$

     Note that in particular \(x^{*}\) will be normalized in expected value.

  4. Generate the right-hand side: \(b=\varPhi x^{*} \in {{\mathbb {R}}}^{m}\).

For both RW algorithms, the proposed one and the method by CWB, we use \(w^{0}=\mathbf {1}\). For CWB we take \(\epsilon _{k}=0.1, \forall k \ge 0\). Following [5], we say \(x^{*}\) was recovered if:

$$\begin{aligned} \Vert x^{\text {RWIter}}-x^{*}\Vert _{\infty } \le 1 \times 10^{-3}. \end{aligned}$$
(20)
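A sketch of the random problem generator and of the recovery criterion (20), following the setting above (the function names are ours, for illustration):

```python
import numpy as np


def generate_problem(s, n=256, m=100, rng=None):
    """Random noiseless instance following steps 1-4 of Sect. 7.1 (based on [5])."""
    rng = np.random.default_rng() if rng is None else rng
    Phi = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))       # step 1: Gaussian matrix
    I_s = rng.choice(n, size=s, replace=False)                 # step 2: random support
    x_star = np.zeros(n)
    x_star[I_s] = rng.normal(0.0, 1.0 / np.sqrt(s), size=s)    # step 3: non-null values
    b = Phi @ x_star                                           # step 4: right-hand side
    return Phi, b, x_star


def recovered(x_rw, x_star, tol=1e-3):
    """Recovery criterion (20): sup-norm error at most 1e-3."""
    return np.max(np.abs(x_rw - x_star)) <= tol
```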

For each level of sparsity \(s \in [15,55]\), a recovery rate is calculated as the percentage of recovered instances over \(N_{p}=300\) random problems. Figure 1 shows the results for different numbers of RW iterations. Results for \(\ell _{1}\) minimization are also shown for reference. With a single RW iteration, the proposed algorithm is slightly better than CWB. This difference disappears for two or more RW iterations, where both algorithms show the same performance, with the additional benefit that in the proposed methodology the weights are interpretable as Lagrange multipliers.

Fig. 1. Recovery rate of RW\(\ell _{1}\) algorithms.

7.2 Results for the Noisy Setting

Following [5], random problems with noise are generated with \(n=256\) and \(m=128\). \(\varPhi \) and \(x^{*}\) are generated in the same way as in the noiseless case. The noise z in b is taken with independent Gaussian coordinates, scaled so that \(x^{*}\) is feasible with high probability. For this we take \(z_{i}=\sigma v_{i}, v_{i} \sim N(0,1) \text { independent}\), so:

$$\begin{aligned} \Vert z\Vert _{2}^{2} = \sigma ^{2} \Vert v\Vert _{2}^{2} = \sigma ^{2} \left( \sum \limits _{i=1}^{m} v_{i}^{2} \right) \sim \sigma ^{2} \chi _{m}^{2}. \end{aligned}$$
(21)

Taking for example \(\eta ^{2} = \sigma ^{2} \left( m + 2 \sqrt{2m} \right) \), we have:

$$\begin{aligned} P \left( \Vert \varPhi x^{*}-b\Vert _{2}^{2} \le \eta ^{2} \right) = 1 - P \left( \chi ^{2}_{128} \ge 160 \right) \simeq 0.971. \end{aligned}$$
(22)
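This probability can be checked numerically with SciPy (a one-off verification; note that it does not depend on \(\sigma \)):

```python
from scipy.stats import chi2

m = 128
threshold = m + 2 * (2 * m) ** 0.5        # eta^2 / sigma^2 = m + 2*sqrt(2m) = 160
p_feasible = chi2.cdf(threshold, df=m)    # P(chi2_128 <= 160)
print(round(p_feasible, 3))               # approx. 0.971, matching (22)
```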

We use \(w^{0}=\mathbf {1}\) for both algorithms. For the subgradient RW-LASSO we take \(\lambda ^{0} = \frac{n}{\Vert {\tilde{x}}\Vert _{1}}\), where \({\tilde{x}}\) is the minimum \(\ell _{2}\)-norm solution of \(\varPhi x =b\). The FISTA algorithm [1] is used to solve each Weighted-LASSO problem. Performance is measured by the improvement with respect to a solution \(x_{\ell _{1}}^{\eta }\) of noisy \(\ell _{1}\) minimization:

$$\begin{aligned} a=100 \times \left( 1-\frac{||x^{RW}-x^{*}||_{2}}{||x_{\ell _{1}}^{\eta }-x^{*}||_{2}} \right) \%. \end{aligned}$$
(23)
Fig. 2. Performance of RW algorithms with respect to \(\ell _{1}\) minimization (with noise).

Figure 2 shows the performance with noisy measurements for both RW methods: the proposed RW-LASSO algorithm and the CWB RW\(\ell _{1}\) algorithm. Results correspond to \(N_{p}=300\) tests on random problems with fixed sparsity \(s=38\). The mean improvement \({\bar{x}}\) is also shown (vertical red line), together with ± one standard deviation \({\bar{\sigma }}\) (vertical violet and green lines). The CWB RW\(\ell _{1}\) algorithm shows a mean improvement of \(21\%\) with respect to \(\ell _{1}\) minimization. For the subgradient RW-LASSO algorithm this improvement is \(32\%\), significantly higher than that of CWB.

We also considered the RW-LASSO algorithm with weights updated as in CWB, but its performance was very poor. The reason may be that \(\lambda ^{k}\) remains fixed at \(\lambda ^{0}\), as there is no obvious rule for updating it.

8 Conclusions

In this paper the important problem of finding sparse solutions of a linear system was considered. A common alternative to this NP-hard problem is the Weighted \(\ell _{1}\) problem, where the choice of weights is crucial. A new methodology for estimating weights was proposed, based on identifying the weights as solutions of a Lagrange dual problem. It was shown that this problem always admits “useful” solutions. The proposed methodology was then applied using the projected subgradient algorithm, obtaining an RW\(\ell _{1}\) algorithm that is an alternative to the classical one due to CWB. This new algorithm was tested on random problems in the noiseless case, obtaining almost the same performance as that of CWB, while allowing an interpretation of the weights as Lagrange multipliers. The proposed methodology was then extended to the noisy case, where an RW-LASSO algorithm was obtained by introducing a new Lagrange multiplier. This last algorithm showed a considerable performance improvement with respect to the RW\(\ell _{1}\) algorithm proposed by CWB.