HIPLZ: Enabling Performance Portability for Exascale Systems

  • Conference paper
Euro-Par 2022: Parallel Processing Workshops (Euro-Par 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13835)

Abstract

While heterogeneous computing has emerged as a dominant trend in current and future High-Performance Computing (HPC) systems, it is also widely recognized that this shift has increased software complexity through a proliferation of programming systems for different heterogeneous processors. One such example is the Heterogeneous-Computing Interface for Portability (HIP) from AMD, which is composed of a C Runtime API and a C++ Kernel Language. Many HPC applications will likely use HIP on future exascale systems (e.g., Frontier and El Capitan), but HIP currently targets only AMD and NVIDIA processors. This limitation creates challenges for users who would also like to run their applications on exascale systems based on other architectures (e.g., Aurora, which is based on Intel hardware).

In this paper, we introduce the design and implementation of HIPLZ, a compiler and runtime system that uses the Intel Level Zero API to support HIP on Intel GPU architectures. We discuss the design of HIPLZ, which is derived from HIPCL (an implementation of HIP on top of OpenCL), and the portability issues that arise from using the Level Zero runtime as a backend. We evaluate our implementation by running several performance benchmarks and mini-apps written in HIP on Intel architectures using HIPLZ. Our results show that this approach provides competitive performance relative to Intel’s OpenCL implementations on Intel Gen9 GPUs, while providing good coverage of the features needed by HPC applications. Overall, this approach is a promising demonstration of enabling performance portability for exascale systems.

Acknowledgements

This work was supported by the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357, and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration). We also gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.

Author information

Corresponding author

Correspondence to Jisheng Zhao.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhao, J., Bertoni, C., Young, J., Harms, K., Sarkar, V., Videau, B. (2023). HIPLZ: Enabling Performance Portability for Exascale Systems. In: Singer, J., Elkhatib, Y., Blanco Heras, D., Diehl, P., Brown, N., Ilic, A. (eds) Euro-Par 2022: Parallel Processing Workshops. Euro-Par 2022. Lecture Notes in Computer Science, vol 13835. Springer, Cham. https://doi.org/10.1007/978-3-031-31209-0_15

  • DOI: https://doi.org/10.1007/978-3-031-31209-0_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31208-3

  • Online ISBN: 978-3-031-31209-0

  • eBook Packages: Computer Science, Computer Science (R0)
