DOI: 10.1145/2909437.2909440
research-article

The Hitchhiker's Guide to Cross-Platform OpenCL Application Development

Published: 19 April 2016

Abstract

One of the benefits of programming in OpenCL is platform portability. That is, an OpenCL program that follows the OpenCL specification should, in principle, execute reliably on any platform that supports OpenCL. To assess the current state of OpenCL portability, we provide an experience report examining two sets of open source benchmarks that we attempted to execute across a variety of GPU platforms via OpenCL. We report on the portability issues we encountered, where applications would execute successfully on one platform but fail on another. We classify the issues into three groups: (1) framework bugs, where the vendor-provided OpenCL framework fails; (2) specification limitations, where the OpenCL specification is unclear and different GPU platforms exhibit different behaviours; and (3) programming bugs, where non-portability arises because the program exercises behaviours that are incorrect or undefined according to the OpenCL specification. The issues we encountered slowed the development process for our sets of applications, but we view them as exciting motivation for future testing and verification efforts to improve the state of OpenCL portability; we conclude with a discussion of these efforts.

Published In

IWOCL '16: Proceedings of the 4th International Workshop on OpenCL
April 2016
131 pages
ISBN: 9781450343381
DOI: 10.1145/2909437

In-Cooperation

  • The University of Bristol

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

IWOCL '16
IWOCL '16: The 4th International Workshop on OpenCL
April 19 - 21, 2016
Vienna, Austria

Acceptance Rates

Overall Acceptance Rate 84 of 152 submissions, 55%
