Abstract
We present a study of matrix-vector product operations on the Maxwell GPU generation using the PyCUDA Python library. Through this lens, we carry out a broad analysis of different memory management schemes and identify the approaches that deliver higher performance on current GPU generations when using dense matrices. The resulting guidelines are then applied to the implementation of the sparse matrix-vector product, covering structured (DIA) and unstructured (CSR) sparse matrix formats. Our experimental study on different datasets reveals that there is little room for improvement in the current state of the memory hierarchy, and that the expected Pascal GPU generation will benefit substantially from our techniques.
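To make the two sparse formats named above concrete, the following is a minimal CPU-side sketch in plain Python of the matrix-vector product y = A·x for a matrix stored in CSR and in DIA. It is not the GPU implementation studied in the paper and uses no PyCUDA calls. The function names `spmv_csr` and `spmv_dia`, the argument names, and the row-indexed DIA layout (entry `diags[d][i]` holds A[i, i + offsets[d]], zero-padded where the diagonal leaves the matrix) are assumptions chosen for illustration.

```python
def spmv_csr(data, indices, indptr, x):
    """Reference y = A @ x with A in CSR form (illustrative helper).

    data    : nonzero values, stored row by row
    indices : column index of each nonzero
    indptr  : indptr[r]..indptr[r + 1] delimits the nonzeros of row r
    """
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def spmv_dia(diags, offsets, x, n_rows):
    """Reference y = A @ x with A in DIA form (illustrative helper).

    Assumed layout: diags[d][i] holds A[i, i + offsets[d]], with zero
    padding where the diagonal falls outside the matrix.
    """
    y = [0.0] * n_rows
    for diag, off in zip(diags, offsets):
        for i in range(n_rows):
            j = i + off
            if 0 <= j < len(x):  # skip padded entries outside the matrix
                y[i] += diag[i] * x[j]
    return y


# 4x4 tridiagonal test matrix with 2 on the main diagonal and -1 beside it.
x = [1, 2, 3, 4]

csr_data = [2, -1, -1, 2, -1, -1, 2, -1, -1, 2]
csr_indices = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
csr_indptr = [0, 2, 5, 8, 10]

dia_offsets = [-1, 0, 1]
dia_diags = [
    [0, -1, -1, -1],  # sub-diagonal, padded at row 0
    [2, 2, 2, 2],     # main diagonal
    [-1, -1, -1, 0],  # super-diagonal, padded at row 3
]

y_csr = spmv_csr(csr_data, csr_indices, csr_indptr, x)
y_dia = spmv_dia(dia_diags, dia_offsets, x, n_rows=4)
```

In this sketch CSR needs an explicit column index for every nonzero, whereas DIA stores only one offset per diagonal, which is why DIA suits structured (banded) matrices and CSR suits unstructured ones.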
Acknowledgements
This work was supported by the Junta de Andalucía of Spain under Project of Excellence P12-TIC-1741. We also thank Nvidia for hardware donations under GPU Education Center 2011–2016, GPU Research Center 2012–2016 and CUDA Fellow 2012–2016 Awards.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Nurudín Álvarez, F., Ortega-Toro, J.A., Ujaldón, M. (2017). Dense and Sparse Matrix-Vector Multiplication on Maxwell GPUs with PyCUDA. In: Barrios Hernández, C., Gitler, I., Klapp, J. (eds) High Performance Computing. CARLA 2016. Communications in Computer and Information Science, vol 697. Springer, Cham. https://doi.org/10.1007/978-3-319-57972-6_16
DOI: https://doi.org/10.1007/978-3-319-57972-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57971-9
Online ISBN: 978-3-319-57972-6
eBook Packages: Computer Science (R0)