Manage OpenMP GPU Data Environment Under Unified Address Space
OpenMP has supported the offload of computations to accelerators such as GPUs since version 4.0. A crucial aspect of OpenMP offloading is managing the accelerator data environment. Currently, this must be explicitly programmed by users, which is non-trivial and often results in suboptimal performance. The unified memory feature available in recent GPU architectures introduces another option, implicit management. However, our experiments show that it incurs several performance issues, especially under GPU memory oversubscription. In this paper, we propose a compiler and runtime collaborative approach to manage OpenMP GPU data under unified memory. In our framework, the compiler performs data reuse analysis to assist runtime data management. The runtime combines static and dynamic information to make optimized data management decisions. We have implemented the proposed technique in the LLVM framework. The evaluation shows our method can achieve significant performance improvement for OpenMP GPU offloading.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
- DOE Contract Number:
- SC0012704
- OSTI ID:
- 1484438
- Report Number(s):
- BNL-209639-2018-COPA
- Resource Relation:
- Journal Volume: 11128; Conference: International Workshop on OpenMP 2018, Barcelona, Spain, 9/26/2018 - 9/28/2018
- Country of Publication:
- United States
- Language:
- English