OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Manage OpenMP GPU Data Environment Under Unified Address Space

Conference

OpenMP has supported offloading computations to accelerators such as GPUs since version 4.0. A crucial aspect of OpenMP offloading is managing the accelerator data environment. Currently, this has to be explicitly programmed by users, which is nontrivial and often results in suboptimal performance. The unified memory feature available in recent GPU architectures introduces another option: implicit management. However, our experiments show that it incurs several performance issues, especially under GPU memory oversubscription. In this paper, we propose a collaborative compiler and runtime approach to managing OpenMP GPU data under unified memory. In our framework, the compiler performs data reuse analysis to assist runtime data management. The runtime combines static and dynamic information to make optimized data management decisions. We have implemented the proposed technology in the LLVM framework. The evaluation shows that our method can achieve significant performance improvement for OpenMP GPU offloading.
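
As an illustration of the explicit data management the abstract refers to, the following is a minimal sketch in C using standard OpenMP 4.5 `target data` and `map` clauses. It is not taken from the paper and does not show the proposed compiler and runtime approach; the function name `saxpy_then_square` and the two kernels are hypothetical, chosen only to show how a user keeps arrays resident on the device across two offloaded loops.

```c
#include <stddef.h>

/* Hypothetical example (not from the paper): compute y = (a*x + y)^2
   in two offloaded loops that share one explicit device data environment. */
void saxpy_then_square(size_t n, float a, const float *x, float *y)
{
    /* Explicit data management: x is copied to the device once on entry,
       y is copied to the device on entry and back to the host on exit.
       Both kernels below reuse the device copies, so no transfer happens
       between them. */
    #pragma omp target data map(to: x[0:n]) map(tofrom: y[0:n])
    {
        /* First kernel: y = a*x + y */
        #pragma omp target teams distribute parallel for
        for (size_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];

        /* Second kernel: y = y*y, reusing y already resident on the device */
        #pragma omp target teams distribute parallel for
        for (size_t i = 0; i < n; ++i)
            y[i] = y[i] * y[i];
    }
}
```

Without the enclosing `target data` region, each `target` construct would map the arrays on its own, so `y` would travel to the device and back twice. Choosing the region boundaries and map types by hand is the programming burden the abstract describes. If no device is available, or the code is compiled without OpenMP support, the loops run on the host and produce the same result.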

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Brookhaven National Laboratory (BNL), Upton, NY (United States)
Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
DOE Contract Number: SC0012704
OSTI ID: 1484438
Report Number(s): BNL-209639-2018-COPA
Resource Relation: Journal Volume: 11128; Conference: International Workshop on OpenMP 2018, Barcelona, Spain, 9/26/2018 - 9/28/2018
Country of Publication: United States
Language: English
