A Simple Compressive Sensing Algorithm for Parallel Many-Core Architectures

Published in: Journal of Signal Processing Systems

Abstract

In this paper we consider the \(\ell_1\)-compressive sensing problem. We propose an algorithm specifically designed to take advantage of shared-memory, vectorized, parallel and many-core microprocessors such as the Cell processor, new-generation Graphics Processing Units (GPUs) and standard vectorized multi-core processors (e.g. quad-core CPUs). Moreover, the algorithm is easy to implement. We also give evidence of the efficiency of our approach and compare the algorithm on the three platforms, thus exhibiting pros and cons for each of them.
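The solver studied here is built on proximal iterations (see the Appendix). As a hedged illustration of why this class of methods suits the targeted hardware, the following is a minimal sketch, in C with OpenMP, of one iterative soft-thresholding step for \(\min_x \frac{1}{2}\|Ax-b\|_2^2 + \mu\|x\|_1\). It is not the authors' algorithm; the function name ista_step, the dense row-major matrix A, the step size t and the caller-provided workspaces are assumptions made for illustration.

    #include <math.h>

    /* One iterative soft-thresholding step for
     *     min_x 0.5 * ||A x - b||_2^2 + mu * ||x||_1,
     * i.e. x <- shrink(x - t * A^T (A x - b), t * mu).
     * A is m x n in row-major order; r (length m) and g (length n)
     * are caller-provided workspaces. All three loops are data
     * parallel, which is what makes this family of solvers fit
     * shared-memory multi-core, SIMD and GPU targets.
     */
    void ista_step(const double *A, const double *b, double *x,
                   double *r, double *g, int m, int n,
                   double t, double mu)
    {
        /* residual r = A x - b */
        #pragma omp parallel for
        for (int i = 0; i < m; ++i) {
            double s = -b[i];
            for (int j = 0; j < n; ++j)
                s += A[i * n + j] * x[j];
            r[i] = s;
        }

        /* gradient g = A^T r */
        #pragma omp parallel for
        for (int j = 0; j < n; ++j) {
            double s = 0.0;
            for (int i = 0; i < m; ++i)
                s += A[i * n + j] * r[i];
            g[j] = s;
        }

        /* gradient step, then component-wise soft thresholding */
        #pragma omp parallel for
        for (int j = 0; j < n; ++j) {
            double v = x[j] - t * g[j];
            double a = fabs(v) - t * mu;
            x[j] = (a > 0.0) ? copysign(a, v) : 0.0;
        }
    }

Each loop is an independent map or dot product over components, so the kernel parallelizes and vectorizes directly; for a step size \(t\) no larger than the reciprocal of the largest eigenvalue of \(A^\top A\), such iterations converge to a minimizer.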



Notes

  1. More information can be found on the CUDA website: http://www.nvidia.com/object/cuda_home.html.

References

  1. Al-Kiswany, S., Gharaibeh, A., Santos-Neto, E., Yuan, G., & Ripeanu, M. (2008). StoreGPU: Exploiting graphics processing units to accelerate distributed storage systems. In: Proceedings of the 17th international symposium on high performance distributed computing (HPDC) (pp. 165–174).

  2. Andrecut, M. (2009). Fast GPU implementation of sparse signal recovery from random projections. Engineering Letters, 17(3), 151–158.

  3. Bader, D. A., & Agarwal, V. (2007). FFTC: Fastest Fourier transform for the IBM Cell Broadband Engine. In: S. Aluru, M. Parashar, R. Badrinath, & V. K. Prasanna (Eds.), HiPC, Lecture Notes in Computer Science (Vol. 4873, pp. 172–184). Springer.

  4. Bajwa, W. U., Haupt, J. D., Raz, G. M., Wright, S. J., & Nowak, R. D. (2007). Toeplitz-structured compressed sensing matrices. In: Proceedings of the 14th IEEE/SP workshop on Statistical Signal Processing (SSP) (pp. 294–298).

  5. Bernstein, D. J. (2007). The tangent FFT. In: S. Boztaş & H.-F. Lu (Eds.), Applied algebra, algebraic algorithms and error-correcting codes, Lecture Notes in Computer Science (Vol. 4851, pp. 291–300). Springer.

  6. Bertsekas, D. (1996). Constrained optimization and Lagrange multiplier methods. Athena Scientific.

  7. Bioucas-Dias, J., & Figueiredo, M. (2007). A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Transactions on Image Processing, 16(12), 2992–3004.

  8. Bredies, K., & Lorenz, D. A. (2008). Iterated hard shrinkage for minimization problems with sparsity constraints. SIAM Journal on Scientific Computing, 30(2), 657–683.

  9. Candès, E., & Romberg, J. (2006). Quantitative robust uncertainty principles and optimally sparse decompositions. Foundations of Computational Mathematics, 6, 227–254.

  10. Candès, E., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2), 489–509.

  11. Candès, E., & Tao, T. (2006). Near optimal signal recovery from random projections: universal encoding strategies?. IEEE Transactions on Information Theory, 52(12), 5406–5426.

  12. Cevher, V., Sankaranarayanan, A., Duarte, M. F., Reddy, D., Baraniuk, R. G., & Chellappa, R. (2008). Compressive sensing for background subtraction. In: Proceedings of the 10th European Conference on Computer Vision (ECCV) (Vol. 5303, pp. 155–168).

  13. Chambolle, A., DeVore, R. A., Lee, N.-Y., & Lucier, B. J. (1998). Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage. IEEE Transactions on Image Processing, 7, 319–335.

  14. Combettes, P., & Pesquet, J.-C. (2007). Proximal thresholding algorithm for minimization over orthonormal bases. SIAM Journal on Optimization, 18(4), 1351–1376.

  15. Dagum, L., & Menon, R. (1998). OpenMP: An industry standard API for shared-memory programming. IEEE Computational Science and Engineering, 5(1), 46–55.

  16. Daubechies, I., Defrise, M., & De Mol, C. (2004). An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11), 1413–1457.

  17. DeVore, R. A. (2007). Deterministic constructions of compressed sensing matrices. Journal of Complexity, 23(4–6), 918–925.

  18. Diefendorff, K., Dubey, P., Hochsprung, R., & Scales, H. (2000). AltiVec extension to PowerPC accelerates media processing. IEEE Micro, 20(2), 85–95.

  19. Donoho, D., Tsaig, Y., Drori, I., & Starck, J.-L. (2006). Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit. Tech. rep.

  20. Donoho, D. L. (2006). Compressed sensing. IEEE Transactions on Information Theory, 52(4), 1289–1306.

  21. Dupé, F.-X., Fadili, J., & Starck, J.-L. (2008). A proximal iteration for deconvolving Poisson noisy images using sparse representations. IEEE Transactions on Image Processing, 16(12), 2992–3004.

  22. Hale, E. T., Yin, W., & Zhang, Y. (2007). A fixed-point continuation method for l1-regularized minimization with applications to compressed sensing. Tech. rep., Rice University.

  23. Elad, M. (2006). Why simple shrinkage is still relevant for redundant representations? IEEE Transactions on Information Theory, 52, 5559–5569.

  24. De Fabritiis, G. (2007). Performance of the Cell processor for biomolecular simulations. Computer Physics Communications, 176(11–12), 660–664.

  25. Figueiredo, M., & Nowak, R. (2003). An EM algorithm for wavelet-based image restoration. IEEE Transactions on Image Processing, 12(8), 906–916.

  26. Figueiredo, M., Nowak, R., & Wright, S. (2007). Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE Journal of Selected Topics in Signal Processing, 1(3), 586–598.

  27. Frigo, M., & Johnson, S. (1998). FFTW: an adaptive software architecture for the FFT. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing (Vol. 3, pp. 1381–1384).

  28. Gilbert, A. C., Strauss, M. J., Tropp, J. A., & Vershynin, R. (2006). Algorithmic linear dimension reduction in the l 1 norm for sparse vectors. In: Proceedings of the 44th Annual Allerton Conference on Communication, Control and Computing (pp. 1411–1418).

  29. Goldstein, T., & Osher, S. (2008). The split Bregman method for l1 regularized problems. Tech. Rep. CAM 08-29, UCLA.

  30. Griesse, R., & Lorenz, D. A. (2008). A semismooth Newton method for Tikhonov functionals with sparsity constraints. Inverse Problems, 24(3).

  31. Hegde, C., Wakin, M., & Baraniuk, R. (2007). Random projections for manifold learning. In: Neural Information Processing Systems (NIPS).

  32. Hiriart-Urruty, J.-B., & Lemaréchal, C. (1996). Convex analysis and minimization algorithms (two volumes, 2nd printing). Heidelberg: Springer.

  33. Kim, S.-J., Koh, K., Lustig, M., Boyd, S., & Gorinevsky, D. (2007). An interior-point method for large-scale l 1-regularized least squares. IEEE Journal of Selected Topics in Signal Processing, 1(4), 606–617.

  34. Kurzak, J., & Dongarra, J. (2007). Implementation of mixed-precision in solving systems of linear equations on the CELL processor. Concurrency and Computation: Practice and Experience, 19(10), 1371–1385.

  35. Lemaréchal, C., & Sagastizábal, C. (1997). Practical aspects of the Moreau-Yosida regularization: theoretical preliminaries. SIAM Journal on Optimization, 7(2), 367–385.

  36. Lieberman, M. D., Sankaranarayanan, J., & Samet, H. (2008). A fast similarity join algorithm using graphics processing units. In: Proceedings of the IEEE international conference on data engineering (ICDE) (pp. 1111–1120).

  37. Lustig, M., Donoho, D., & Pauly, J. M. (2007). Sparse MRI: the application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine, 58(6), 1182–1195.

  38. Mairal, J., Bach, F., Ponce, J., Sapiro, G., & Zisserman, A. (2008). Discriminative learned dictionaries for local image analysis. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 23–28).

  39. Moreau, J. (1965). Proximité et dualité dans un espace hilbertien. Bulletin de la S.M.F., 93, 273–299.

  40. Nickolls, J., Buck, I., Garland, M., & Skadron, K. (2008). Scalable parallel programming with CUDA. Queue, 6(2), 40–53.

  41. Nikolova, M. (1999). Markovian reconstruction using a GNC approach. IEEE Transactions on Image Processing, 8(9), 1204–1220.

  42. Nikolova, M., Idier, J., & Mohammad-Djafari, A. (1998). Inversion of large-support ill-posed linear operators using a piecewise Gaussian MRF. IEEE Transactions on Image Processing, 7(4), 571–585.

  43. Nikolova, M., Ng, M. K., Zhang, S. Q., & Ching, W. K. (2008). Efficient reconstruction of piecewise constant images using nonsmooth nonconvex minimization. SIAM Journal on Imaging Sciences, 1(1), 2–25.

  44. Nowak, R., & Figueiredo, M. (2001). Fast wavelet-based image deconvolution using the EM algorithm. In: Proceedings of the 35th Asilomar conference on signals, systems, and computers (pp. 371–375).

  45. O’Brien, K., O’Brien, K. M., Sura, Z., Chen, T., & Zhang, T. (2008). Supporting OpenMP on Cell. International Journal of Parallel Programming, 36(3), 289–311.

  46. Owens, J. D., Luebke, D., Govindaraju, N., Harris, M., Krüger, J., Lefohn, A. E., et al. (2007). A survey of general-purpose computation on graphics hardware. Computer Graphics Forum, 26(1), 80–113.

  47. Pham, D., Asano, S., Bolliger, M., Day, M. N., Hofstee, H. P., Johns, C., et al. (2005). The design and implementation of a first-generation CELL processor. In: Proceedings of the solid-state circuits conference (pp. 184–185).

  48. Seiler, L., Carmean, D., Sprangle, E., Forsyth, T., Abrash, M., Dubey, P., et al. (2008). Larrabee: a many-core x86 architecture for visual computing. In: ACM SIGGRAPH 2008 papers (pp. 1–15). New York, NY, USA: ACM.

  49. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58, 267–288.

  50. Tropp, J. (2006). Just relax: Convex programming methods for identifying sparse signals. IEEE Transactions on Information Theory, 52(3), 1030–1051.

  51. Tropp, J. A. (2004). Greed is good: algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10), 2231–2242.

  52. Williams, S., Carter, J., Oliker, L., Shalf, J., & Yelick, K. (2008). Lattice Boltzmann simulation optimization on leading multicore platforms. In IEEE international parallel and distributed processing symposium (pp. 1–14).

  53. Williams, S., Shalf, J., Oliker, L., Kamil, S., Husbands, P., & Yelick, K. (2006). The potential of the Cell processor for scientific computing. In: Proceedings of the 3rd conference on computing frontiers (CF) (pp. 9–20). New York, NY, USA: ACM.

  54. Wright, J., Yang, A., Ganesh, A., Sastry, S., & Ma, Y. (2009). Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2), 210–227.

  55. Yin, W., Osher, S., Goldfarb, D., & Darbon, J. (2008). Bregman iterative algorithms for l1-minimization with applications to compressed sensing. SIAM Journal on Imaging Sciences, 1(1), 143–168.

  56. Yosida, K. (1965). Functional analysis. Springer.

Download references

Acknowledgements

The research of A. Borghi on this work was done while he was at the Mathematics Department of UCLA, supported by ONR grant N000140710810. The research of J. Darbon and S. Osher was supported by ONR grant N000140710810. The research of T. Chan was supported by DMS-0610079 and ONR N00014-06-1-0345. This work has also been supported in part by the French National Research Agency (ANR) through the COSINUS program (project MIDAS no. ANR-09-COSI-009).

Author information

Corresponding author

Correspondence to Jérôme Darbon.

Appendix: Proof of Convergence

We briefly present the standard elements of the proof of convergence of the proximal iterations. The approach is quite standard, and the proof can be adapted from [32, 35], for instance.

First, let us note that the sequence \(\{E_\mu\left(p(u^{(k)})\right)\}\) is non-increasing (as shown below) and bounded from below, and thus converges toward some value referred to as η.
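This monotonicity follows from the definition of the proximal point: since \(p\left(u^{(k)}\right)\) minimizes \(u \mapsto E_\mu(u) + \frac{1}{2}\|u - u^{(k)}\|_M^2\) (see the instantiation below), comparing its objective value with that of the candidate \(u = u^{(k)}\) gives

$$ E_\mu\left(p\left(u^{(k)}\right)\right) + \frac{1}{2}\left\| p\left(u^{(k)}\right) - u^{(k)} \right\|_M^2 \leq E_\mu\left(u^{(k)}\right) \enspace, $$

so that \(E_\mu\left(p\left(u^{(k)}\right)\right) \leq E_\mu\left(u^{(k)}\right) = E_\mu\left(p\left(u^{(k-1)}\right)\right)\).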

Then, let us recall a standard convex optimality result. Assume that \(g:\mathbb{R}^N\rightarrow \mathbb{R}\) is convex and differentiable and that \(h:\mathbb{R}^N \rightarrow \mathbb{R}\) is convex. Then \(u^\star\) is a global minimizer of (g + h) if and only if the following holds:

$$ \label{eq.opt-sum-convex} \forall u \in \mathbb{R}^N\,\, \langle \nabla g(u^\star), u - u^\star \rangle + h(u) - h(u^\star) \geq 0\enspace. $$
(6)

For our case, let us take \(g(\cdot) = \frac{1}{2}\| \cdot - u^{(k)}\|_M^2\) and \(h(\cdot) = E_\mu(\cdot)\), and recall that \(p\left(u^{(k)}\right)\) is the global minimizer of the inf-convolution when it is fed with \(u^{(k)}\):

$$\begin{array}{rll} \label{eq.opt-sum-convex-for-our-case} \forall u \in \mathbb{R}^N\,\, \left\langle p\left(u^{(k)}\right) - u^{(k)} , u - p\left(u^{(k)}\right) \right\rangle_M \\ +\; E_\mu(u) - E_\mu\left(p\left(u^{(k)}\right)\right) \geq 0\enspace. \end{array} $$
(7)

Now, we consider this inequality for the two points \(\hat{u}\) and \(\bar{u}\), with associated proximal points \(p(\hat{u})\) and \(p(\bar{u})\), respectively. Writing Eq. 7 at \(\hat{u}\) with \(u = p(\bar{u})\), writing it at \(\bar{u}\) with \(u = p(\hat{u})\), and summing the two inequalities (the \(E_\mu\) terms cancel) leads to:

$$ \left\langle p(\hat{u}) - \hat{u},\, p(\bar{u}) - p(\hat{u}) \right\rangle_M + \left\langle p(\bar{u}) - \bar{u},\, p(\hat{u}) - p(\bar{u}) \right\rangle_M \geq 0 \enspace, $$

and thus we get:

$$ \left\langle \bar{u} - \hat{u}, p(\bar{u}) - p(\hat{u}) \right\rangle_M \geq \| p(\hat{u}) - p(\bar{u})\|_M^2 \enspace. $$

The latter is equivalent to:

$$ \left\| \hat{u} - \bar{u} \right\|_M^2 - \| p(\hat{u}) - \hat{u} - p(\bar{u}) + \bar{u} \|_M^2 \geq \| p(\hat{u}) - p(\bar{u}) \|_M^2 \enspace. $$
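To verify this equivalence, set \(a = p(\hat{u}) - p(\bar{u})\) and \(c = \hat{u} - \bar{u}\) and expand the \(M\)-weighted square:

$$ \|c\|_M^2 - \|a - c\|_M^2 = 2\langle a, c\rangle_M - \|a\|_M^2 \geq 2\|a\|_M^2 - \|a\|_M^2 = \|a\|_M^2 \enspace, $$

where the inequality is exactly the bound \(\left\langle \bar{u} - \hat{u}, p(\bar{u}) - p(\hat{u}) \right\rangle_M = \langle a, c \rangle_M \geq \|a\|_M^2\) obtained above.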

Now, let us set \(\bar{u}\) to a global minimizer of \(E_\mu\), i.e., \(\bar{u} = u^\star\), and \(\hat{u} = u^{(k)}\), in the previous inequality. Note that \(p\left(u^\star\right) = u^\star\), and recall that \(u^{(k+1)} = p\left(u^{(k)}\right)\). The following inequality holds:

$$ \label{eq.ineq-wanted-for-convergence} \left\| u^{(k)} - u^\star \right\|_M^2 - \left\| u^{(k+1)} - u^{(k)} \right\|_M^2 \geq \left\| u^{(k+1)} - u^\star \right\|_M^2 \enspace. $$
(8)

With this inequality we can conclude using the following points.

The sequence \(\| u^{(k)} - u^\star\|_M^2\) is non-increasing and bounded from below (by zero), and thus converges. We deduce that \(\lim_{k \rightarrow \infty} \left( \| u^{(k)} - u^\star \|_M^2 - \| u^{(k+1)} - u^\star \|_M^2 \right) = 0\). Using this result and Eq. 8 we get that \(\lim_{k\rightarrow \infty} \| u^{(k+1)} - u^{(k)}\|_M^2 = 0\).
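Indeed, rearranging Eq. 8 gives

$$ \left\| u^{(k+1)} - u^{(k)} \right\|_M^2 \leq \left\| u^{(k)} - u^\star \right\|_M^2 - \left\| u^{(k+1)} - u^\star \right\|_M^2 \enspace, $$

and the right-hand side tends to zero by the previous limit.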

Note that, by the convexity of the energy \(E_\mu\), we have, for any subgradient \(\partial E_\mu\left(u^{(k+1)}\right)\) of \(E_\mu\) at \(u^{(k+1)}\):

$$ \label{eq.convergence-comvexite-optim} E_\mu\left(u^\star\right) \geq E_\mu\left(u^{(k+1)}\right) + \left\langle \partial E_\mu \left(u^{(k+1)}\right) , u^{\star} - u^{(k+1)}\right\rangle\enspace. $$
(9)

Since \(u^{(k+1)}\) is the global minimizer of \(F_\mu\) (the proximal objective \(g + h\) above), we have:

$$\begin{array}{rll} \label{eq.convergence-optim-prox} \left\langle \partial E_\mu \left(u^{(k+1)}\right) , u^{\star} - u^{(k+1)}\right\rangle \\ +\; \left\langle u^{(k+1)} - u^{(k)}, u^\star - u^{(k+1)} \right\rangle \geq 0\enspace. \end{array} $$
(10)

Recall that, as shown above, \(\lim_{k\rightarrow + \infty} \| u^{(k+1)} - u^{(k)}\|^2 = 0\); since \(\|u^{(k+1)} - u^\star\|^2\) is bounded, Eq. 10 yields \(\liminf_{k\rightarrow + \infty} \left\langle \partial E_\mu\left(u^{(k+1)}\right), u^{\star} - u^{(k+1)}\right\rangle \geq 0\). Injecting this information into Eq. 9, we obtain \(E_\mu\left(u^\star\right) \geq \eta\). Since \(u^\star\) is a global minimizer of \(E_\mu\), we also have \(E_\mu\left(u^\star\right) \leq E_\mu\left(u^{(k)}\right)\) for all \(k\), hence \(E_\mu\left(u^\star\right) \leq \eta\). We thus conclude that \(\lim_{k\rightarrow + \infty} E_\mu\left(u^{(k)}\right) = \eta = E_\mu\left(u^\star\right)\).


About this article

Cite this article

Borghi, A., Darbon, J., Peyronnet, S. et al. A Simple Compressive Sensing Algorithm for Parallel Many-Core Architectures. J Sign Process Syst 71, 1–20 (2013). https://doi.org/10.1007/s11265-012-0671-9
