Abstract
Many algorithms have been developed for the \(l_1\)-minimization (\(l_1\)-min) problem. The main components of most \(l_1\)-min algorithms are dense matrix-vector multiplications, such as \(Ax\) and \(A^Tx\), together with vector operations. We propose a novel warp-based implementation of the matrix-vector multiplication \(Ax\) on the graphics processing unit (GPU), called the GEMV kernel, and a novel thread-based implementation of the matrix-vector multiplication \(A^Tx\) on the GPU, called the GEMV-T kernel. The GEMV kernel uses a self-adaptive warp allocation strategy to assign the optimal number of warps to each matrix row; analogously, the GEMV-T kernel uses a self-adaptive thread allocation strategy to assign the optimal number of threads to each matrix row. We take two popular \(l_1\)-min algorithms as examples: the fast iterative shrinkage-thresholding algorithm (FISTA) and the augmented Lagrangian multiplier (ALM) method. Based on the GEMV and GEMV-T kernels, we present two highly parallel \(l_1\)-min solvers on the GPU that exploit kernel merging and the sparsity of the \(l_1\)-min solution. Furthermore, we design a concurrent multiple-\(l_1\)-min solver on the GPU and optimize its performance using newer GPU features such as the shuffle instruction and the read-only data cache. Experimental results validate the high efficiency and good performance of our methods.
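To illustrate the warp-based idea behind the GEMV kernel, the following minimal CUDA sketch assigns one full warp to each matrix row and combines partial sums with the shuffle instruction. This is not the authors' implementation: it assumes row-major storage and a fixed allocation of one 32-thread warp per row, whereas the paper's kernel chooses the warp count per row self-adaptively.

```cuda
#include <cuda_runtime.h>

// Illustrative sketch: compute y = A * x with one warp per row.
// A is m x n, stored row-major; lanes stride across the row so that
// memory accesses within the warp are coalesced.
__global__ void gemv_warp_per_row(const float *A, const float *x,
                                  float *y, int m, int n)
{
    int warpId = (blockIdx.x * blockDim.x + threadIdx.x) / 32; // one warp per row
    int lane   = threadIdx.x & 31;                             // lane index in warp
    if (warpId >= m) return;

    float sum = 0.0f;
    for (int j = lane; j < n; j += 32)
        sum += A[warpId * n + j] * x[j];

    // Warp-level reduction of the 32 partial sums via the shuffle instruction.
    for (int offset = 16; offset > 0; offset >>= 1)
        sum += __shfl_down_sync(0xffffffffu, sum, offset);

    if (lane == 0)
        y[warpId] = sum;
}
```

For the transposed product \(A^Tx\) (the GEMV-T kernel), an analogous thread-per-column mapping avoids the write conflicts a naive transposed traversal would cause.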
Acknowledgments
We gratefully acknowledge the anonymous reviewers, whose comments greatly helped us improve the paper.
This work is supported by the National Science Foundation of China under Grant Number 61379017 and the Science and Technology Plan Project of Zhejiang Province, China under Grant Number 2014C33077.
Cite this article
Gao, J., Li, Z., Liang, R. et al. Adaptive Optimization \(l_1\)-Minimization Solvers on GPU. Int J Parallel Prog 45, 508–529 (2017). https://doi.org/10.1007/s10766-016-0430-9