DOI: 10.1145/2628071.2671423
poster

Design of a Hybrid MPI-CUDA Benchmark Suite for CPU-GPU Clusters

Published: 24 August 2014

ABSTRACT

In the last few years, GPUs have become an integral part of HPC clusters. To test these heterogeneous CPU-GPU systems, we designed a hybrid CUDA-MPI benchmark suite consisting of three communication- and compute-intensive applications: Matrix Multiplication (MM), Needleman-Wunsch (NW), and the A-DFA compression algorithm [1]. The main goal of this work is to characterize these workloads on CPU-GPU clusters. Our benchmark applications are designed to help cluster administrators identify bottlenecks in the cluster, decide whether scaling applications to multiple nodes would improve or reduce overall throughput, and design effective scheduling policies. Our experiments show that inter-node communication can significantly degrade the throughput of communication-intensive applications. We conclude that the scalability of the applications depends primarily on two factors: the cluster configuration and the applications' characteristics.
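As a rough illustration of why scaling to more nodes can either raise or lower throughput, the sketch below uses a toy cost model. It is not taken from the poster: the function name `throughput`, the fixed exchange volume, and all numbers are hypothetical, and real clusters overlap communication with computation in ways this model ignores.

```python
def throughput(work_flops, nodes, node_flops_per_s, bytes_exchanged, link_bytes_per_s):
    """Toy model (an assumption, not the authors' method): jobs per second
    when one job is split evenly across `nodes` identical nodes."""
    # Compute time shrinks linearly with the number of nodes.
    compute_s = work_flops / (nodes * node_flops_per_s)
    # A single node pays no inter-node cost; otherwise the job pays for
    # moving a fixed volume of data over the interconnect.
    comm_s = 0.0 if nodes == 1 else bytes_exchanged / link_bytes_per_s
    return 1.0 / (compute_s + comm_s)


# Hypothetical compute-heavy job: little data to exchange.
compute_heavy = dict(work_flops=1e12, node_flops_per_s=1e12,
                     bytes_exchanged=1e8, link_bytes_per_s=1e9)
# Hypothetical communication-heavy job: less work, ten times more data.
comm_heavy = dict(work_flops=1e11, node_flops_per_s=1e12,
                  bytes_exchanged=1e9, link_bytes_per_s=1e9)

for name, job in (("compute-heavy", compute_heavy), ("comm-heavy", comm_heavy)):
    one = throughput(nodes=1, **job)
    four = throughput(nodes=4, **job)
    print(f"{name}: 1 node = {one:.2f} jobs/s, 4 nodes = {four:.2f} jobs/s")
```

Under these made-up parameters the compute-heavy job gains from four nodes while the communication-heavy job loses throughput, which matches the qualitative conclusion of the abstract: the outcome depends on both the interconnect and the application's compute-to-communication ratio.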

References

  1. M. Becchi and P. Crowley, "A-DFA: A Time- and Space-Efficient DFA Compression Algorithm for Fast Regular Expression Evaluation," ACM TACO, vol. 10, no. 1, pp. 1–26, 2013.
  2. S. B. Needleman and C. D. Wunsch, "A general method applicable to the search for similarities in the amino acid sequence of two proteins," J. of Molecular Biology, vol. 48, no. 3, pp. 443–453, 1970.
  3. How to Optimize Data Transfers in CUDA C/C++, http://devblogs.nvidia.com/parallelforall/how-optimize-datatransfers-cuda-cc.
  4. An Introduction to CUDA-Aware MPI, http://devblogs.nvidia.com/parallelforall/introduction-cudaaware-mpi.

    • Published in

      PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation
      August 2014
      514 pages
      ISBN: 9781450328098
      DOI: 10.1145/2628071

      Copyright © 2014 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      PACT '14 paper acceptance rate: 54 of 144 submissions, 38%
      Overall acceptance rate: 121 of 471 submissions, 26%
