skip to main content
10.1145/3437801.3441590acmconferencesArticle/Chapter ViewAbstractPublication PagesppoppConference Proceedingsconference-collections
poster

On the parallel I/O optimality of linear algebra kernels: near-optimal LU factorization

Published: 17 February 2021 Publication History

Abstract

Dense linear algebra kernels are fundamental components of many scientific computing applications. In this work we present a novel method of deriving parallel I/O lower bounds for this broad family of programs. Based on the X-Partitioning abstraction, our method explicitly captures inter-statement dependencies. Applying our analysis to LU factorization, we derive COnfLUX, an LU algorithm with the parallel I/O cost of N3/([EQUATION]) communicated elements per processor - only 1/3× over our established lower bound. We evaluate COnfLUX on various problem sizes, demonstrating empirical results that match our theoretical analysis, communicating less than Cray ScaLAPACK, SLATE, and the asymptotically-optimal CANDMC library. Running on 1,024 nodes of Piz Daint, COnfLUX communicates 1.6× less than the second-best implementation and is expected to communicate 2.1× less on a full-scale run on Summit.

Reference

[1]
Grzegorz Kwasniewski et al. 2020. On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal LU Factorization. (2020). arXiv:cs.DC/2010.05975

Cited By

View all
  • (2024)Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency AnalysisIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2023.3303431(1-20)Online publication date: 2024
  • (2024)Towards faster and robust solution for dynamic LR and QR factorizationScientific Reports10.1038/s41598-024-76537-014:1Online publication date: 13-Nov-2024
  • (2023)Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph QueriesACM Computing Surveys10.1145/360493256:2(1-40)Online publication date: 15-Sep-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2021
507 pages
ISBN:9781450382946
DOI:10.1145/3437801
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 February 2021

Check for updates

Author Tags

  1. complexity measures
  2. models of computation
  3. numerical linear algebra

Qualifiers

  • Poster

Funding Sources

Conference

PPoPP '21

Acceptance Rates

PPoPP '21 Paper Acceptance Rate 31 of 150 submissions, 21%;
Overall Acceptance Rate 230 of 1,014 submissions, 23%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)19
  • Downloads (Last 6 weeks)0
Reflects downloads up to 22 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency AnalysisIEEE Transactions on Pattern Analysis and Machine Intelligence10.1109/TPAMI.2023.3303431(1-20)Online publication date: 2024
  • (2024)Towards faster and robust solution for dynamic LR and QR factorizationScientific Reports10.1038/s41598-024-76537-014:1Online publication date: 13-Nov-2024
  • (2023)Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph QueriesACM Computing Surveys10.1145/360493256:2(1-40)Online publication date: 15-Sep-2023
  • (2023)High-Performance and Programmable Attentional Graph Neural Networks with Global Tensor FormulationsProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3581784.3607067(1-16)Online publication date: 12-Nov-2023
  • (2023)The Role of Machine Learning in CybersecurityDigital Threats: Research and Practice10.1145/35455744:1(1-38)Online publication date: 7-Mar-2023
  • (2023)Data Distribution Schemes for Dense Linear Algebra Factorizations on Any Number of Nodes2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS)10.1109/IPDPS54959.2023.00047(390-401)Online publication date: May-2023
  • (2022)Multitask Balanced and Recalibrated Network for Medical Code PredictionACM Transactions on Intelligent Systems and Technology10.1145/356304114:1(1-20)Online publication date: 9-Nov-2022
  • (2022)Automatic Differentiation of C++ Codes on Emerging Manycore Architectures with SacadoACM Transactions on Mathematical Software10.1145/356026248:4(1-29)Online publication date: 19-Dec-2022
  • (2022)A New Similarity Space Tailored for Supervised Deep Metric LearningACM Transactions on Intelligent Systems and Technology10.1145/355976614:1(1-25)Online publication date: 9-Nov-2022
  • (2022)Decentralized Online Learning: Take Benefits from Others’ Data without Sharing Your Own to Track Global TrendACM Transactions on Intelligent Systems and Technology10.1145/355976514:1(1-22)Online publication date: 9-Nov-2022
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media