Dense and Sparse Matrix-Vector Multiplication on Maxwell GPUs with PyCUDA

Nurudín Álvarez, Francisco; Ortega-Toro, José Antonio; Ujaldón, Manuel

doi:10.1007/978-3-319-57972-6_16


Part of the book series: Communications in Computer and Information Science (CCIS, volume 697)

Included in the following conference series:

Latin American High Performance Computing Conference

Abstract

We present a study of matrix-vector product operations on the Maxwell GPU generation using the PyCUDA Python library. Through this lens, we carry out a broad analysis of different memory management schemes and identify the approaches that deliver the highest performance for dense matrices on current GPU generations. The resulting guidelines are then applied to the implementation of the sparse matrix-vector product, covering structured (DIA) and unstructured (CSR) sparse matrix formats. Our experimental study on different datasets reveals that there is little room for improvement in the current state of the memory hierarchy, and that the expected Pascal GPU generation will benefit greatly from our techniques.
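
To make the two sparse formats named in the abstract concrete, the following is a minimal sequential sketch in plain Python, not the authors' PyCUDA implementation. The function names `spmv_csr` and `spmv_dia`, the argument names, and the row-indexed DIA layout (`data[d][i]` holds the entry of row `i` on diagonal `offsets[d]`) are assumptions made for illustration. In a GPU kernel, the loop over rows would typically be distributed across threads.

```python
# Reference (CPU, sequential) sparse matrix-vector products y = A @ x.
# Illustrative sketch only; names and the DIA layout are assumptions.

def spmv_csr(n_rows, row_ptr, col_idx, values, x):
    """CSR: nonzeros of row i are values[row_ptr[i]:row_ptr[i + 1]]."""
    y = [0.0] * n_rows
    for i in range(n_rows):  # one GPU thread per row in a scalar kernel
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def spmv_dia(n_rows, n_cols, offsets, data, x):
    """DIA: data[d][i] stores A[i][i + offsets[d]]; padding is ignored."""
    y = [0.0] * n_rows
    for d, off in enumerate(offsets):
        for i in range(n_rows):
            j = i + off
            if 0 <= j < n_cols:  # skip padded entries outside the matrix
                y[i] += data[d][i] * x[j]
    return y


# 4x4 tridiagonal example: 4 on the main diagonal, 1 on both neighbours.
x = [1.0, 2.0, 3.0, 4.0]

row_ptr = [0, 2, 5, 8, 10]
col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
values = [4.0, 1.0, 1.0, 4.0, 1.0, 1.0, 4.0, 1.0, 1.0, 4.0]

offsets = [-1, 0, 1]
data = [
    [0.0, 1.0, 1.0, 1.0],  # sub-diagonal (first slot is padding)
    [4.0, 4.0, 4.0, 4.0],  # main diagonal
    [1.0, 1.0, 1.0, 0.0],  # super-diagonal (last slot is padding)
]

y_csr = spmv_csr(4, row_ptr, col_idx, values, x)
y_dia = spmv_dia(4, 4, offsets, data, x)
```

Both calls compute the same product. CSR stores explicit column indices and so handles arbitrary sparsity patterns, whereas DIA stores none and relies on the nonzeros lying along a few diagonals.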

Acknowledgements

This work was supported by the Junta de Andalucía of Spain under Project of Excellence P12-TIC-1741. We also thank Nvidia for hardware donations under GPU Education Center 2011–2016, GPU Research Center 2012–2016 and CUDA Fellow 2012–2016 Awards.

Author information

Authors and Affiliations

Computer Architecture Department, ESTI Informática, University of Málaga, Bulevar Louis Pasteur, s/n. Campus Teatinos, 29071, Malaga, Spain
Francisco Nurudín Álvarez, José Antonio Ortega-Toro & Manuel Ujaldón

Corresponding author

Correspondence to Manuel Ujaldón.

Editor information

Editors and Affiliations

Universidad Industrial de Santander, Bucaramanga, Colombia
Carlos Jaime Barrios Hernández
Centro de Investigación y de Estudios Avanzados, CINVESTAV-IPN, Ciudad de México, Mexico
Isidoro Gitler
Instituto Nacional de Investigaciones Nucleares, La Marquesa, Estado de México, Mexico
Jaime Klapp

About this paper

Cite this paper

Nurudín Álvarez, F., Ortega-Toro, J.A., Ujaldón, M. (2017). Dense and Sparse Matrix-Vector Multiplication on Maxwell GPUs with PyCUDA. In: Barrios Hernández, C., Gitler, I., Klapp, J. (eds) High Performance Computing. CARLA 2016. Communications in Computer and Information Science, vol 697. Springer, Cham. https://doi.org/10.1007/978-3-319-57972-6_16

DOI: https://doi.org/10.1007/978-3-319-57972-6_16
Published: 29 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57971-9
Online ISBN: 978-3-319-57972-6
eBook Packages: Computer Science, Computer Science (R0)
