poster

Accelerating Sparse General Matrix-Matrix Multiplication for NVIDIA Volta GPU and Hygon DCU

Authors:

Zhuo Tian,

Shuai Yang,

Changyou ZhangAuthors Info & Claims

HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing

Pages 329 - 330

https://doi.org/10.1145/3588195.3595936

Published: 07 August 2023 Publication History

Get Access

Abstract

Sparse general matrix-matrix multiplication (SpGEMM) is challenging especially on graphic accelerators. Existing solutions do not fully utilize the shared memory of the graphics accelerator. Our proposal could effectively utilize the graphics accelerator's on-chip shared memory and dynamically assign the device resources by grouping the rows based on a hybrid strategy for load balancing. Experiments show that our proposal achieves speedups of up to x7.43 in double precision compared to existing SpGEMM libraries. Our implementation is fully general and our optimization strategy adaptively processes the SpGEMM workload row-wise to substantially improve performance by decreasing the work complexity and utilizing the memory hierarchy more effectively.

References

[1]

Tao Tang, Xuejun Yang, Yisong Lin, 2011. Cache Miss Analysis for GPU Programs Based on Stack Distance Profile. ICDCS. 623--634.

Google Scholar

[2]

Gonzalo Berger, Manuel Freire, Renzo Marini, Ernesto Dufrechou, Pablo Ezzatti, 2021. Unleashing the performance of bmSparse for the sparse matrix multiplication in GPUs. ScalA@SC. 19--26.

Google Scholar

[3]

Akrem Benatia, Weixing Ji, Yizhuo Wang, Feng Shi, 2020. Sparse matrix partitioning for optimizing SpMV on CPU-GPU heterogeneous platforms. Int. J. High Perform. Comput. Appl. 34(1).

Google Scholar

Index Terms

Accelerating Sparse General Matrix-Matrix Multiplication for NVIDIA Volta GPU and Hygon DCU
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel algorithms
      1. Shared memory algorithms

Recommendations

Adaptive sparse matrix-matrix multiplication on the GPU
PPoPP '19: Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming

In the ongoing efforts targeting the vectorization of linear algebra primitives, sparse matrix-matrix multiplication (SpGEMM) has received considerably less attention than sparse Matrix-Vector multiplication (SpMV). While both are equally important, ...
A framework for general sparse matrix-matrix multiplication on GPUs and heterogeneous processors

General sparse matrix-matrix multiplication (SpGEMM) is a fundamental building block for numerous applications such as algebraic multigrid method (AMG), breadth first search and shortest path problem. Compared to other sparse BLAS routines, an efficient ...
GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication

Many high performance computing applications require computing both sparse matrix-vector product SMVP and sparse matrix-transpose vector product SMTVP for better overall performance. Under such a circumstance, it is critical to maintain a similarly high ...

Comments

Information & Contributors

Information

Published In

HPDC '23: Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing

August 2023

350 pages

ISBN:9798400701559

DOI:10.1145/3588195

General Chair:
Ali R. Butt
Virginia Tech, USA
,
Program Chairs:
Ningfang Mi
Northeastern University, USA
,
Kyle Chard
University of Chicago & Argonne National Laboratory, USA

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 August 2023

Check for updates

Author Tags

Qualifiers

Poster

Funding Sources

China?s National Key Research and Development Project
GHfund

Conference

HPDC '23

Sponsor:

HPDC '23: The 32nd International Symposium on High-Performance Parallel and Distributed Computing

June 16 - 23, 2023

FL, Orlando, USA

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
160
Total Downloads

Downloads (Last 12 months)65
Downloads (Last 6 weeks)0

Reflects downloads up to 02 Mar 2025

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Abstract

References

Index Terms

Recommendations

Adaptive sparse matrix-matrix multiplication on the GPU

A framework for general sparse matrix-matrix multiplication on GPUs and heterogeneous processors

GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication

Comments

Information

Published In

Sponsors

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Funding Sources

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Login options

Full Access

View options

PDF

eReader

Share

Share this Publication link

Share on social media

Affiliations