DOI: 10.1145/1989493.1989531

Brief announcement: communication bounds for heterogeneous architectures

Published: 04 June 2011

Abstract

As the gap between the cost of communication (i.e., data movement) and computation continues to grow, the importance of pursuing algorithms that minimize communication also increases. Toward this end, we seek asymptotic communication lower bounds for general memory models and classes of algorithms. Recent work has established lower bounds for a wide set of linear algebra algorithms on a sequential machine and on a parallel machine with identical processors. This work extends these previous bounds to a heterogeneous model in which processors access data and perform floating-point operations at differing speeds. We also present an algorithm for dense matrix multiplication that attains the lower bound.
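
The heterogeneous setting described in the abstract can be illustrated with a small sketch. It assumes a simplified cost model that is not taken from this page: each processor needs `gamma` seconds per flop and `beta` seconds per word moved, has a fast memory of `M` words, and moves on the order of F / sqrt(M) words to perform F classical matrix-multiplication flops (the order of the sequential bound of Hong and Kung, with constants dropped). The function names and the `(gamma, beta, M)` tuple layout are hypothetical. Under those assumptions, the work is split so that all processors finish at the same time.

```python
import math


def words_lower_bound(flops, fast_memory_words):
    """Order-of-magnitude words moved for `flops` classical matmul flops.

    Assumption: the bound has the form flops / sqrt(M); constant factors
    are dropped, so this is a scaling estimate, not an exact count.
    """
    return flops / math.sqrt(fast_memory_words)


def balanced_partition(total_flops, procs):
    """Split flops across heterogeneous processors so all finish together.

    procs: list of (gamma, beta, M) tuples (hypothetical layout), where
      gamma = seconds per flop, beta = seconds per word, M = fast memory words.
    Assumed cost model: time_i = F_i * (gamma_i + beta_i / sqrt(M_i)).
    """
    # Effective seconds per flop, including the assumed communication cost.
    per_flop = [g + b / math.sqrt(m) for g, b, m in procs]
    # A processor's share is proportional to its effective speed.
    rates = [1.0 / c for c in per_flop]
    total_rate = sum(rates)
    shares = [total_flops * r / total_rate for r in rates]
    finish_time = total_flops / total_rate
    return shares, finish_time


if __name__ == "__main__":
    # Two made-up processors: effective costs 1 + 2/2 = 2 and 3 + 3/3 = 4.
    shares, t = balanced_partition(1200, [(1.0, 2.0, 4), (3.0, 3.0, 9)])
    print(shares, t)  # [800.0, 400.0] 1600.0
```

In this toy run the processor with the lower effective cost per flop receives twice the work, and both finish after 1600 time units. Whether this matches the partition used in the paper's algorithm is not stated on this page.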

    Published In

    SPAA '11: Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures
    June 2011
    404 pages
    ISBN:9781450307437
    DOI:10.1145/1989493

    In-Cooperation

    • EATCS: European Association for Theoretical Computer Science

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. communication-avoiding
    2. heterogeneity

    Qualifiers

    • Abstract

    Conference

    SPAA '11

    Acceptance Rates

    Overall Acceptance Rate 447 of 1,461 submissions, 31%
