On the Roles of the Programmer, the Compiler and the Runtime System When Programming Accelerators in OpenMP

Ozen, Guray; Ayguadé, Eduard; Labarta, Jesús

doi:10.1007/978-3-319-11454-5_16

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8766)

Included in the following conference series:

International Workshop on OpenMP

Abstract

The latest OpenMP specification, version 4.0, includes the accelerator model. In this paper we present a partial implementation of this specification in the OmpSs programming model developed at the Barcelona Supercomputing Center, with the aim of identifying what the roles of the programmer, the compiler and the runtime system should be in order to facilitate the asynchronous execution of tasks on architectures with multiple accelerator devices and processors. The design of OmpSs is strongly biased towards delegating most decisions to the runtime system, which, based on the task graph built at runtime from depend clauses, is able to schedule tasks in a data-flow fashion onto the available processors and accelerator devices and to orchestrate data transfers and reuse among multiple address spaces. For this reason our implementation is partial: from 4.0 it considers only those directives that enable the compiler to generate the so-called “kernels” to be executed on the target device. Several extensions to the current specification are also presented, such as the specification of tasks in “native” CUDA and OpenCL, or how to specify the device and data privatization in the target construct. Finally, the paper discusses some challenges found in code generation and presents a preliminary performance evaluation with some kernel applications.
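
As a rough illustration of the division of labour described in the abstract, the sketch below uses only standard OpenMP 4.0 directives in C: depend clauses give the runtime the information it needs to build the task graph, and a target construct marks the loop from which the compiler would generate a kernel. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names saxpy_target and run_pipeline, the vector-update kernel and the initial values are all hypothetical, and none of the OmpSs-specific extensions mentioned above (native CUDA or OpenCL tasks, device and privatization clauses) are shown.

```c
#include <assert.h>

/* Hypothetical kernel: y[i] = a * x[i] + y[i].
 * The target construct asks the compiler to generate device code for the loop;
 * if no device is available, or OpenMP is disabled, the loop runs on the host. */
static void saxpy_target(int n, float a, const float *x, float *y)
{
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical driver: four tasks whose ordering is derived by the runtime
 * from the depend clauses rather than spelled out by the programmer. */
float run_pipeline(int n, float a, float *x, float *y)
{
    assert(n > 0 && x && y);
    float sum = 0.0f;

    #pragma omp parallel
    #pragma omp single
    {
        /* The two producers have no mutual dependence and may run concurrently. */
        #pragma omp task depend(out: x[0:n])
        for (int i = 0; i < n; ++i) x[i] = 1.0f;

        #pragma omp task depend(out: y[0:n])
        for (int i = 0; i < n; ++i) y[i] = 2.0f;

        /* Offloaded task: waits for both producers. */
        #pragma omp task depend(in: x[0:n]) depend(inout: y[0:n])
        saxpy_target(n, a, x, y);

        /* Consumer: waits for the offloaded task to finish updating y. */
        #pragma omp task depend(in: y[0:n]) shared(sum)
        for (int i = 0; i < n; ++i) sum += y[i];

        #pragma omp taskwait
    }
    return sum;
}
```

With the initial values used here and a = 3, every y[i] ends up as 5, so calling run_pipeline(1000, 3.0f, x, y) on two caller-provided arrays of 1000 floats returns 5000. The programmer states only what each task reads and writes; the ordering of tasks and any host-to-device transfers are left to the compiler and runtime, which is the allocation of roles the abstract argues for. Building with an OpenMP-enabled compiler (for example GCC with -fopenmp) activates the directives; without it the pragmas are ignored and the code runs sequentially with the same result.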

Author information

Authors and Affiliations

Barcelona Supercomputing Center (BSC-CNS), Universitat Politècnica de Catalunya (UPC–BarcelonaTECH), Barcelona, Spain
Guray Ozen, Eduard Ayguadé & Jesús Labarta

Editor information

Editors and Affiliations

Cray Inc., Cray Plaza, 380 Jackson St., Suite 210, 55101, St. Paul, MN, USA
Luiz DeRose
Lawrence Livermore National Laboratory, 94551-0808, Livermore, CA, USA
Bronis R. de Supinski
Sandia National Laboratories, Albuquerque, NM, USA
Stephen L. Olivier
University of Houston, Houston, TX, USA
Barbara M. Chapman
RWTH Aachen, Aachen, Germany
Matthias S. Müller

About this paper

Cite this paper

Ozen, G., Ayguadé, E., Labarta, J. (2014). On the Roles of the Programmer, the Compiler and the Runtime System When Programming Accelerators in OpenMP. In: DeRose, L., de Supinski, B.R., Olivier, S.L., Chapman, B.M., Müller, M.S. (eds) Using and Improving OpenMP for Devices, Tasks, and More. IWOMP 2014. Lecture Notes in Computer Science, vol 8766. Springer, Cham. https://doi.org/10.1007/978-3-319-11454-5_16

DOI: https://doi.org/10.1007/978-3-319-11454-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11453-8
Online ISBN: 978-3-319-11454-5
eBook Packages: Computer Science (R0)
