poster

OpenCL as a unified programming model for heterogeneous CPU/GPU clusters

Authors: Jun Lee, Jaejin Lee

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

Pages 299 - 300

https://doi.org/10.1145/2145816.2145863

Published: 25 February 2012

Abstract

In this paper, we propose an OpenCL framework for heterogeneous CPU/GPU clusters, and show that the framework achieves both high performance and ease of programming. The framework provides an illusion of a single system for the user. It allows the application to utilize multiple heterogeneous compute devices, such as multicore CPUs and GPUs, in a remote node as if they were in a local node. No communication API, such as the MPI library, is required in the application source. We implement the OpenCL framework and evaluate its performance on a heterogeneous CPU/GPU cluster that consists of one host node and nine compute nodes using eleven OpenCL benchmark applications.
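
The single-system illusion described in the abstract can be pictured with a small toy model. This is only an illustrative sketch, not the authors' framework or the OpenCL API: the names `ComputeNode`, `SingleSystemRuntime`, and `enqueue`, and the node names, are hypothetical, and the "remote" dispatch is an ordinary function call standing in for whatever communication the real runtime performs. It shows the one idea the abstract states: the application sees a flat list of devices and contains no communication calls of its own.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Device:
    """One compute device (CPU or GPU) that physically lives on some node."""
    node: str  # hypothetical node name, e.g. "compute-1"
    kind: str  # "CPU" or "GPU"


@dataclass
class ComputeNode:
    """Stand-in for a remote compute node that owns a few devices."""
    name: str
    kinds: List[str]
    log: List[str] = field(default_factory=list)

    def devices(self) -> List[Device]:
        return [Device(self.name, kind) for kind in self.kinds]

    def run(self, device: Device, kernel: Callable[[int], int],
            data: List[int]) -> List[int]:
        # Assumption: a real cluster runtime would send a message to the
        # node here; this toy model just calls the kernel directly.
        self.log.append(f"{device.kind} ran {len(data)} items")
        return [kernel(x) for x in data]


class SingleSystemRuntime:
    """Presents every device in the cluster as one flat, local-looking list."""

    def __init__(self, nodes: List[ComputeNode]):
        self._nodes = {node.name: node for node in nodes}
        self.devices = [dev for node in nodes for dev in node.devices()]

    def enqueue(self, device_index: int, kernel: Callable[[int], int],
                data: List[int]) -> List[int]:
        # The runtime, not the application, decides which node to contact.
        device = self.devices[device_index]
        return self._nodes[device.node].run(device, kernel, data)


# "Application" code: it only indexes devices, with no communication calls.
nodes = [ComputeNode("compute-1", ["CPU", "GPU"]),
         ComputeNode("compute-2", ["GPU"])]
rt = SingleSystemRuntime(nodes)

data = list(range(6))
# Split the input round-robin across every device in the cluster.
chunks = [data[i::len(rt.devices)] for i in range(len(rt.devices))]
results = [rt.enqueue(i, lambda x: x * x, chunk)
           for i, chunk in enumerate(chunks)]
print(results)  # prints [[0, 9], [1, 16], [4, 25]]
```

The application part never names a node; swapping the direct call in `ComputeNode.run` for real network traffic would not change it, which is the property the abstract attributes to the framework.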

Index Terms

OpenCL as a unified programming model for heterogeneous CPU/GPU clusters
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent programming languages
2. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Runtime environments
      2. Source code generation
    2. General programming languages
      1. Language types
        1. Concurrent programming languages

Published In

PPoPP '12: Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming

February 2012

352 pages

ISBN:9781450311601

DOI:10.1145/2145816

General Chair: J. Ramanujam, Louisiana State University, USA

Program Chair: P. Sadayappan, The Ohio State University, USA

ACM SIGPLAN Notices Volume 47, Issue 8
PPOPP '12
August 2012
334 pages
ISSN:0362-1340
EISSN:1558-1160
DOI:10.1145/2370036

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Poster

Conference

PPoPP '12

Sponsor:

SIGPLAN

PPoPP '12: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

February 25 - 29, 2012

New Orleans, Louisiana, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Bibliometrics

Total Citations: 15

Total Downloads: 835

Downloads (last 12 months): 18

Downloads (last 6 weeks): 2

Reflects downloads up to 20 Jan 2025

Cited By

Kucher V, Fey F, Gorlatch S (2018). Unified Cross-Platform Profiling of Parallel C++ Applications. 2018 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), pages 57-62. Online publication date: Nov-2018. https://doi.org/10.1109/PMBS.2018.8641652

Xu S, Chang T (2017). A Feasible Architecture for ARM-Based Microserver Systems Considering Energy Efficiency. IEEE Access 5, pages 4611-4620. Online publication date: 2017. https://doi.org/10.1109/ACCESS.2017.2657658

Deepika H, Mangala N, Babu S (2016). Automatic program generation for heterogeneous architectures. 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pages 102-109. Online publication date: Sep-2016. https://doi.org/10.1109/ICACCI.2016.7732032

Huang D, Xun C, Wu N, Wen M, Zhang C, Cai X, Yang Q (2015). Enabling a Uniform OpenCL Device View for Heterogeneous Platforms. IEICE Transactions on Information and Systems E98.D:4, pages 812-823. Online publication date: 2015. https://doi.org/10.1587/transinf.2014EDP7244

Vishwanathan M, Shah R, Kim K, Choi M (2015). Silent Data Corruption (SDC) vulnerability of GPU on various GPGPU workloads. 2015 International SoC Design Conference (ISOCC), pages 11-12. Online publication date: Nov-2015. https://doi.org/10.1109/ISOCC.2015.7401681

Meraji S, Keenleyside J, Kamath S, Blainey B (2015). Towards a Combined Grouping and Aggregation Algorithm for Fast Query Processing in Columnar Databases with GPUs. Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, pages 594-603. Online publication date: 25-May-2015. https://dl.acm.org/doi/10.1109/IPDPSW.2015.21

Perelygin K, Lam S, Wu X (2014). Graphics Processing Units and Open Computing Language for parallel computing. Computers and Electrical Engineering 40:1, pages 241-251. Online publication date: 1-Jan-2014. https://dl.acm.org/doi/10.1016/j.compeleceng.2013.11.015

Wu B, Zhao Z, Zhang E, Jiang Y, Shen X (2013). Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU. ACM SIGPLAN Notices 48:8, pages 57-68. Online publication date: 23-Feb-2013. https://dl.acm.org/doi/10.1145/2517327.2442523

Kofler K, Grasso I, Cosenza B, Fahringer T, Malony A, Nemirovsky M, Midkiff S (2013). An automatic input-sensitive approach for heterogeneous task partitioning. Proceedings of the 27th international ACM conference on International conference on supercomputing, pages 149-160. Online publication date: 10-Jun-2013. https://dl.acm.org/doi/10.1145/2464996.2465007

Wu B, Zhao Z, Zhang E, Jiang Y, Shen X, Nicolau A, Shen X, Amarasinghe S, Vuduc R (2013). Complexity analysis and algorithm design for reorganizing data to minimize non-coalesced memory accesses on GPU. Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 57-68. Online publication date: 23-Feb-2013. https://dl.acm.org/doi/10.1145/2442516.2442523
