Comparing Different Programming Approaches for SpMV-Operations on GPUs

Ecker, Jan P.; Berrendorf, Rudolf; Razzaq, Javed; Scholl, Simon E.; Mannuss, Florian

doi:10.1007/978-3-319-32149-3_50

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9573)

Included in the following conference series:

International Conference on Parallel Processing and Applied Mathematics

Abstract

Various high- and low-level approaches exist for GPU programming. These include the newer directive-based OpenACC programming model, Nvidia’s programming platform CUDA, and existing libraries with fixed functionality such as cuSPARSE. This work compares the attained performance and the development effort of these approaches, using the implementation of the sparse matrix-vector multiplication (SpMV) operation as an example. SpMV is an important and performance-critical building block in many application fields. We show that the main differences in development effort between CUDA and OpenACC relate to memory management and thread mapping.
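To make the two points of comparison concrete, the following is a minimal sketch of an SpMV routine written in C with OpenACC directives. It is not the authors' implementation, which this page does not reproduce. The CSR storage format, the function name `spmv_csr`, and the gang/vector mapping are assumptions chosen for illustration. The data clauses stand in for the memory management that a CUDA version would typically write out by hand with calls such as `cudaMalloc` and `cudaMemcpy`, and the `gang` and `vector` clauses stand in for the thread mapping that a CUDA kernel would derive from its block and thread indices.

```c
#include <assert.h>
#include <stddef.h>

/* Computes y = A * x for a sparse matrix A stored in CSR form.
 * CSR is an assumed format; the abstract does not name one.
 * row_ptr has n_rows + 1 entries; col_idx and val have row_ptr[n_rows] entries. */
void spmv_csr(int n_rows, int n_cols,
              const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
    assert(row_ptr != NULL && col_idx != NULL && val != NULL);
    assert(x != NULL && y != NULL);

    int nnz = row_ptr[n_rows]; /* number of stored nonzeros */

    /* Memory management: the data clauses ask the compiler to copy the
     * inputs to the device and the result back to the host.
     * Thread mapping: rows are distributed over gangs. A compiler without
     * OpenACC support ignores the pragmas and runs the loops serially. */
    #pragma acc parallel loop gang \
        copyin(row_ptr[0:n_rows+1], col_idx[0:nnz], val[0:nnz], x[0:n_cols]) \
        copyout(y[0:n_rows])
    for (int i = 0; i < n_rows; ++i) {
        double sum = 0.0;

        /* The nonzeros of one row are spread over vector lanes and
         * combined with a sum reduction. */
        #pragma acc loop vector reduction(+:sum)
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            sum += val[k] * x[col_idx[k]];
        }
        y[i] = sum;
    }
}
```

Which mapping performs best is likely to depend on the matrix structure and the target GPU, so the clauses above should be read as one plausible starting point rather than a tuned configuration.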

Notes

1. Version 6.5 for Tesla K20m and M2050, version 7.0 for Tesla K80.

Acknowledgements

We would like to thank the CMT team at Saudi Aramco EXPEC ARC for their support and input. We especially want to thank Ali H. Dogru for making this research project possible.

Author information

Authors and Affiliations

Computer Science Department, Bonn-Rhein-Sieg University of Applied Sciences, Sankt Augustin, Germany
Jan P. Ecker, Rudolf Berrendorf, Javed Razzaq & Simon E. Scholl
EXPEC Advanced Research Center, Saudi Arabian Oil Company, Dhahran, Saudi Arabia
Florian Mannuss

Corresponding author

Correspondence to Jan P. Ecker.

Editor information

Editors and Affiliations

Czestochowa University of Technology, Czestochowa, Poland
Roman Wyrzykowski
Department of Computer Science, University of Southern California, Marina Del Rey, California, USA
Ewa Deelman
Electrical Engineering & Comput. Science, University of Tennessee, Knoxville, Tennessee, USA
Jack Dongarra
Czestochowa University of Technology, Institute of Computer & Information Sci., Czestochowa, Poland
Konrad Karczewski
Department of Computer Science, AGH University of Science and Technology, Krakow, Poland
Jacek Kitowski
Systèmes d’informations, Big Data et Rec, AGH University of Science and Technology, Krakow, Poland
Kazimierz Wiatr

Cite this paper

Ecker, J.P., Berrendorf, R., Razzaq, J., Scholl, S.E., Mannuss, F. (2016). Comparing Different Programming Approaches for SpMV-Operations on GPUs. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K., Kitowski, J., Wiatr, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2015. Lecture Notes in Computer Science, vol 9573. Springer, Cham. https://doi.org/10.1007/978-3-319-32149-3_50

DOI: https://doi.org/10.1007/978-3-319-32149-3_50
Published: 02 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32148-6
Online ISBN: 978-3-319-32149-3
eBook Packages: Computer Science, Computer Science (R0)
