HIPLZ: Enabling Performance Portability for Exascale Systems

Zhao, Jisheng; Bertoni, Colleen; Young, Jeffrey; Harms, Kevin; Sarkar, Vivek; Videau, Brice

doi:10.1007/978-3-031-31209-0_15

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13835)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

While heterogeneous computing has emerged as a dominant trend in current and future High-Performance Computing (HPC) systems, it is also widely recognized that this shift has led to increased software complexity due to a proliferation of programming systems for different heterogeneous processors. One such example is the Heterogeneous-Computing Interface for Portability from AMD (HIP), which is composed of a C Runtime API and C++ Kernel Language. Many HPC applications will likely use HIP on future exascale systems (e.g., Frontier and El Capitan), but HIP currently only targets AMD and NVIDIA processors. This limitation creates challenges for users who would also like to run their applications on exascale systems based on other architectures (e.g., Aurora, which is based on Intel hardware) that are currently not targeted by HIP.

In this paper, we introduce the design and implementation of HIPLZ, a compiler and runtime system that uses the Intel Level Zero API to support HIP on Intel GPU architectures. We discuss the design of HIPLZ, derived from HIPCL (an implementation of HIP on top of OpenCL), and portability issues that occur from using the Level Zero runtime as a backend. We evaluate our implementation by running several performance benchmarks and mini-apps written in HIP on Intel architectures using HIPLZ. Our results show that this approach provides competitive performance relative to Intel’s OpenCL implementations on Intel Gen9 GPUs, while providing good coverage of features needed by HPC applications. Overall, this approach is a promising demonstration of enabling performance portability for exascale systems.

Acknowledgements

This work was supported by the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357, and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration). We also gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.

Author information

Authors and Affiliations

Georgia Institute of Technology, Atlanta, GA, USA
Jisheng Zhao, Jeffrey Young & Vivek Sarkar
Argonne National Laboratory, Lemont, IL, USA
Colleen Bertoni, Kevin Harms & Brice Videau

Corresponding author

Correspondence to Jisheng Zhao.

Editor information

Editors and Affiliations

University of Glasgow, Glasgow, UK
Jeremy Singer
University of Glasgow, Glasgow, UK
Yehia Elkhatib
University of Santiago de Compostela, Santiago de Compostela, La Coruña, Spain
Dora Blanco Heras
Louisiana State University, Baton Rouge, LA, USA
Patrick Diehl
University of Edinburgh, Edinburgh, UK
Nick Brown
Universidade de Lisboa, Lisbon, Portugal
Aleksandar Ilic

About this paper

Cite this paper

Zhao, J., Bertoni, C., Young, J., Harms, K., Sarkar, V., Videau, B. (2023). HIPLZ: Enabling Performance Portability for Exascale Systems. In: Singer, J., Elkhatib, Y., Blanco Heras, D., Diehl, P., Brown, N., Ilic, A. (eds) Euro-Par 2022: Parallel Processing Workshops. Euro-Par 2022. Lecture Notes in Computer Science, vol 13835. Springer, Cham. https://doi.org/10.1007/978-3-031-31209-0_15

DOI: https://doi.org/10.1007/978-3-031-31209-0_15
Published: 02 May 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31208-3
Online ISBN: 978-3-031-31209-0
eBook Packages: Computer Science, Computer Science (R0)
