
Performance evaluation of sparse matrix products in UPC

The Journal of Supercomputing

Abstract

Unified Parallel C (UPC) is a Partitioned Global Address Space (PGAS) language whose popularity has increased in recent years owing to its high programmability and reasonable performance, achieved through efficient exploitation of data locality, especially on hierarchical architectures such as multicore clusters. However, the performance issues that arise in this language due to the irregular structure of sparse matrix operations have not yet been studied. One such issue is the choice of storage format: selecting an adequate format for the sparse matrices can significantly improve the efficiency of the parallel codes. This paper presents an evaluation, using UPC, of the most common sparse storage formats with different implementations of the matrix-vector and matrix-matrix products, which are key kernels in many scientific applications.
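To make the matrix-vector kernel concrete, the following is a minimal sequential sketch in plain C of a sparse matrix-vector product using compressed sparse row (CSR) storage. The abstract does not name the formats evaluated, so the choice of CSR is an assumption, and the type `csr_matrix`, its field names, and the function `csr_spmv` are hypothetical names introduced only for illustration. This is not the authors' UPC code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR container; the field names are illustrative only. */
typedef struct {
    size_t n_rows;          /* number of matrix rows */
    const size_t *row_ptr;  /* n_rows + 1 offsets into col_idx and values */
    const size_t *col_idx;  /* column index of each stored nonzero */
    const double *values;   /* value of each stored nonzero */
} csr_matrix;

/* Computes y = A * x for a CSR matrix A (sequential sketch). */
void csr_spmv(const csr_matrix *a, const double *x, double *y)
{
    assert(a != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < a->n_rows; ++i) {
        double sum = 0.0;
        /* Only the nonzeros stored for row i are visited. */
        for (size_t k = a->row_ptr[i]; k < a->row_ptr[i + 1]; ++k) {
            /* Indirect access to x: the source of the irregularity. */
            sum += a->values[k] * x[a->col_idx[k]];
        }
        y[i] = sum;
    }
}
```

The indirect access `x[a->col_idx[k]]` illustrates the irregular memory access pattern that makes these kernels sensitive to the storage format. A parallel version would typically partition the rows among threads; how the paper distributes the data is not stated in the abstract.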



Acknowledgements

This work was funded by Hewlett-Packard (Project “Improving UPC Usability and Performance in Constellation Systems: Implementation/Extensions of UPC Libraries”), the Ministry of Science and Innovation of Spain (Project TIN2010-16735), the Ministry of Education (FPU Grant AP2008-01578), and the Spanish network CAPAP-H3 (Project TIN2010-12011-E). We gratefully thank CESGA (Galicia Supercomputing Center) for providing access to the Finis Terrae supercomputer.

Author information

Corresponding author

Correspondence to Jorge González-Domínguez.

About this article

Cite this article

González-Domínguez, J., García-López, Ó., Taboada, G.L. et al. Performance evaluation of sparse matrix products in UPC. J Supercomput 64, 100–109 (2013). https://doi.org/10.1007/s11227-012-0796-4


  • DOI: https://doi.org/10.1007/s11227-012-0796-4
