Performance evaluation of the sparse matrix-vector multiplication on modern architectures

Goumas, Georgios; Kourtis, Kornilios; Anastopoulos, Nikos; Karakasis, Vasileios; Koziris, Nectarios

doi:10.1007/s11227-008-0251-8

Published: 25 November 2008

Volume 50, pages 36–77 (2009)

The Journal of Supercomputing

Abstract

In this paper, we revisit the performance issues of the widely used sparse matrix-vector multiplication (SpMxV) kernel on modern microarchitectures. Previous scientific work reports a number of different factors that may significantly reduce performance. However, the interaction of these factors with the underlying architectural characteristics is not clearly understood, which may lead to misguided, and thus unsuccessful, optimization attempts. To gain insight into the details of SpMxV performance, we conduct a suite of experiments on a rich set of matrices for three different commodity hardware platforms. In addition, we investigate the parallel version of the kernel and report on the corresponding performance results and their relation to each architecture’s specific multithreaded configuration. Based on our experiments, we draw conclusions that can serve as guidelines for optimizing both the single-threaded and multithreaded versions of the kernel.
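For readers unfamiliar with the kernel, the following is a minimal sketch of SpMxV, not the authors' implementation. It assumes the matrix is stored in the common compressed sparse row (CSR) layout, which the abstract does not specify, and all names (`spmv_csr`, `spmv_csr_rows`, `row_ptr`, `col_ind`, `values`, `num_threads`) are hypothetical. The multithreaded path only illustrates a static partitioning of rows among workers; in CPython it is not expected to run faster than the serial loop.

```python
from concurrent.futures import ThreadPoolExecutor


def spmv_csr_rows(row_ptr, col_ind, values, x, y, row_begin, row_end):
    """Compute y[i] = sum_j A[i, j] * x[j] for rows in [row_begin, row_end)."""
    for i in range(row_begin, row_end):
        acc = 0.0
        # The nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read indirectly through col_ind, so its access pattern
            # depends on the sparsity structure of the matrix.
            acc += values[k] * x[col_ind[k]]
        y[i] = acc


def spmv_csr(row_ptr, col_ind, values, x, num_threads=1):
    """Return y = A * x for a CSR matrix (assumed layout, see lead-in)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    # Static partitioning: each worker owns a contiguous, disjoint block of
    # rows, so no two workers ever write to the same element of y.
    chunk = max(1, (n_rows + num_threads - 1) // num_threads)
    bounds = [(s, min(s + chunk, n_rows)) for s in range(0, n_rows, chunk)]
    with ThreadPoolExecutor(max_workers=max(1, num_threads)) as pool:
        futures = [
            pool.submit(spmv_csr_rows, row_ptr, col_ind, values, x, y, b, e)
            for b, e in bounds
        ]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return y


# Example matrix, chosen for illustration only:
#     [1 0 2]
# A = [0 3 0]
#     [4 0 5]
row_ptr = [0, 2, 3, 5]
col_ind = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y_serial = spmv_csr(row_ptr, col_ind, values, x)
y_threaded = spmv_csr(row_ptr, col_ind, values, x, num_threads=2)
```

In this sketch each nonzero contributes one multiply-add while requiring reads of `values[k]`, `col_ind[k]`, and an indirectly addressed element of `x`, which gives a rough sense of why memory behavior matters for this kernel.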

Author information

Authors and Affiliations

Computing Systems Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Zografou Campus, Zografou, 15780, Greece
Georgios Goumas, Kornilios Kourtis, Nikos Anastopoulos, Vasileios Karakasis & Nectarios Koziris

Corresponding author

Correspondence to Georgios Goumas.

About this article

Cite this article

Goumas, G., Kourtis, K., Anastopoulos, N. et al. Performance evaluation of the sparse matrix-vector multiplication on modern architectures. J Supercomput 50, 36–77 (2009). https://doi.org/10.1007/s11227-008-0251-8

Received: 16 November 2007
Accepted: 28 October 2008
Published: 25 November 2008
Issue Date: October 2009
DOI: https://doi.org/10.1007/s11227-008-0251-8
