Abstract
While heterogeneous computing has emerged as a dominant trend in current and future High-Performance Computing (HPC) systems, it is also widely recognized that this shift has increased software complexity because of the proliferation of programming systems for different heterogeneous processors. One such example is AMD's Heterogeneous-Computing Interface for Portability (HIP), which is composed of a C Runtime API and a C++ Kernel Language. Many HPC applications will likely use HIP on future exascale systems (e.g., Frontier and El Capitan), but HIP currently targets only AMD and NVIDIA processors. This limitation creates challenges for users who would also like to run their applications on exascale systems based on other architectures (e.g., Aurora, which is based on Intel hardware).
In this paper, we introduce the design and implementation of HIPLZ, a compiler and runtime system that uses the Intel Level Zero API to support HIP on Intel GPU architectures. We discuss the design of HIPLZ, which is derived from HIPCL (an implementation of HIP on top of OpenCL), and the portability issues that arise from using the Level Zero runtime as a backend. We evaluate our implementation by running several performance benchmarks and mini-apps written in HIP on Intel architectures using HIPLZ. Our results show that this approach provides competitive performance relative to Intel’s OpenCL implementations on Intel Gen9 GPUs, while providing good coverage of the features needed by HPC applications. Overall, this approach is a promising demonstration of enabling performance portability for exascale systems.
Acknowledgements
This work was supported by the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357, and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration). We also gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, J., Bertoni, C., Young, J., Harms, K., Sarkar, V., Videau, B. (2023). HIPLZ: Enabling Performance Portability for Exascale Systems. In: Singer, J., Elkhatib, Y., Blanco Heras, D., Diehl, P., Brown, N., Ilic, A. (eds) Euro-Par 2022: Parallel Processing Workshops. Euro-Par 2022. Lecture Notes in Computer Science, vol 13835. Springer, Cham. https://doi.org/10.1007/978-3-031-31209-0_15
DOI: https://doi.org/10.1007/978-3-031-31209-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31208-3
Online ISBN: 978-3-031-31209-0
eBook Packages: Computer Science, Computer Science (R0)