PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Direct Solver on Distributed Heterogeneous Systems

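Authors: Xu Fu, Bingbin Zhang, Tengcheng Wang, Wenhao Li, Yuechen Lu, Enxin Yi, Jianqi Zhao, Xiaohan Geng, Fangying Li, Jingwen Zhang, Zhou Jin, Weifeng Liu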

SC '23: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

Article No.: 51, Pages 1 - 14

https://doi.org/10.1145/3581784.3607050

Published: 11 November 2023

Abstract

Sparse direct solvers play a vital role in large-scale high-performance computing in science and engineering. Existing distributed sparse direct methods employ multifrontal or supernodal patterns to aggregate columns of nearly identical structure and to exploit dense Basic Linear Algebra Subprograms (BLAS) for computation. However, such a data layout may cause more unevenness when the structure of the input matrix is not ideal, and using dense BLAS may waste many floating-point operations on zero fill-ins.

In this paper, we propose a new sparse direct solver called PanguLU. Unlike the multifrontal/supernodal layout, our work relies on simpler regular 2D blocking and stores the blocks in their sparse forms to avoid any extra fill-ins. Based on the sparse patterns of the blocks, a variety of block-wise sparse BLAS methods are developed and selected for higher efficiency on local GPUs. To make PanguLU more scalable, we also adjust the mapping of blocks to processes for a more balanced overall workload, and propose a synchronisation-free communication strategy that accounts for the dependencies among different sub-tasks to reduce overall latency overhead.
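As a rough illustration of regular 2D blocking with sparse block storage, the sketch below splits a toy sparse matrix into fixed-size blocks and assigns them to a process grid with the textbook 2D block-cyclic rule. It is not PanguLU's implementation: the function names, the coordinate-list block format, the toy matrix, and the 2x2 process grid are all assumptions made for illustration, and the load-balancing adjustment of the mapping described in the abstract is not reproduced here.

```python
from collections import Counter

def block_of(row, col, block_size):
    """Return the (block_row, block_col) index of a matrix entry."""
    return row // block_size, col // block_size

def owner(block_row, block_col, grid_rows, grid_cols):
    """Textbook 2D block-cyclic rule: block (I, J) goes to process
    (I mod grid_rows, J mod grid_cols), flattened to a single rank."""
    return (block_row % grid_rows) * grid_cols + (block_col % grid_cols)

def partition(entries, block_size):
    """Group nonzeros into regular 2D blocks, keeping only non-empty blocks.
    Each block stays sparse: a list of (local_row, local_col, value)."""
    blocks = {}
    for r, c, v in entries:
        key = block_of(r, c, block_size)
        blocks.setdefault(key, []).append((r % block_size, c % block_size, v))
    return blocks

def nonzeros_per_process(blocks, grid_rows, grid_cols):
    """Count how many nonzeros each process owns under the cyclic mapping."""
    load = Counter()
    for (bi, bj), items in blocks.items():
        load[owner(bi, bj, grid_rows, grid_cols)] += len(items)
    return load

# Hypothetical 8x8 test matrix: a diagonal plus a dense first column.
entries = [(i, i, 2.0) for i in range(8)] + [(i, 0, 1.0) for i in range(1, 8)]

blocks = partition(entries, block_size=2)
load = nonzeros_per_process(blocks, grid_rows=2, grid_cols=2)

print("non-empty blocks:", len(blocks), "of", (8 // 2) ** 2)
for rank in range(4):
    print("process", rank, "owns", load[rank], "nonzeros")
```

On this toy input only 7 of the 16 blocks are non-empty, and the plain cyclic rule gives process 0 seven of the 15 nonzeros while process 1 receives none. That kind of imbalance is one plausible reason a solver would adjust the block-to-process mapping rather than use the cyclic rule unchanged.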

Experiments on two distributed heterogeneous platforms consisting of 128 NVIDIA A100 GPUs and 128 AMD MI50 GPUs demonstrate that PanguLU achieves up to 11.70x and 17.97x speedups over the latest SuperLU_DIST, and scales up to 47.51x and 74.84x on the 128 A100 and MI50 GPUs over a single GPU, respectively.
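To put the scaling figures in perspective, the short calculation below converts the reported best-case speedups into parallel efficiency, defined here as speedup over a single GPU divided by the number of GPUs. The helper name is hypothetical, and the inputs are only the two "up to" values quoted above, so the results describe the best reported cases rather than typical behaviour.

```python
def parallel_efficiency(speedup, num_devices):
    """Speedup over one device divided by the number of devices used."""
    return speedup / num_devices

# Best-case speedups on 128 GPUs over a single GPU, as quoted in the abstract.
reported = {"A100": 47.51, "MI50": 74.84}

efficiency = {gpu: parallel_efficiency(s, 128) for gpu, s in reported.items()}
for gpu, value in efficiency.items():
    print(f"{gpu}: {value:.1%}")
```

This works out to roughly 37.1% on the A100 platform and 58.5% on the MI50 platform.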

Supplemental Material

MP4 file (400.00 MB): SC23 paper presentation recording for "PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Direct Solver on Distributed Heterogeneous Systems", by Xu Fu, Bingbin Zhang, Tengcheng Wang, Wenhao Li, Yuechen Lu, Enxin Yi, Jianqi Zhao, Xiaohan Geng, Fangying Li, Jingwen Zhang, Zhou Jin, Weifeng Liu
