Abstract:
Power consumption poses a significant challenge in current and emerging GPU-enabled high-performance computing (HPC) systems. Modern GPUs provide controls such as dynamic voltage and frequency scaling (DVFS), among others, to regulate power consumption. Because workloads vary in computational intensity and a wide range of frequency settings is available, selecting the optimal frequency configuration for a given GPU workload is non-trivial. Applying a power control with the sole objective of reducing power may degrade performance and thereby increase energy consumption. In this study, we characterize and identify GPU utilization metrics that influence both the power and execution time of a given workload. Analytical models for power and execution time are then proposed using the characterized feature set. Multi-objective functions, namely the energy-delay product (EDP) and the energy-delay-squared product (ED²P), are used to select an optimal GPU DVFS configuration for a workload such that power consumption is reduced with no or negligible degradation in performance. The evaluation was conducted using SPEC ACCEL benchmarks on an NVIDIA GV100 GPU. The proposed power and performance analytical models demonstrated prediction accuracies of up to 99.2% and 98.8%, respectively. On average, the benchmarks showed 28.6% and 25.2% energy savings using the EDP and ED²P approaches, respectively, without performance degradation. Furthermore, the proposed models require metric collection at only the maximum frequency rather than at all supported DVFS configurations.
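As an illustration of the selection step described above, the following minimal sketch picks a DVFS frequency by minimizing EDP or ED²P over per-frequency predictions. It assumes the standard definitions (energy E = P·T, EDP = E·T, ED²P = E·T²). The function names, the frequency values, and the power and time figures are hypothetical placeholders, not the paper's models or measurements.

```python
from typing import Callable, Dict, Tuple

# (predicted power in watts, predicted execution time in seconds)
Prediction = Tuple[float, float]


def energy(power_w: float, time_s: float) -> float:
    """Energy in joules: E = P * T."""
    return power_w * time_s


def edp(power_w: float, time_s: float) -> float:
    """Energy-delay product: E * T = P * T^2."""
    return energy(power_w, time_s) * time_s


def ed2p(power_w: float, time_s: float) -> float:
    """Energy-delay-squared product: E * T^2 = P * T^3."""
    return energy(power_w, time_s) * time_s ** 2


def select_frequency(
    predictions: Dict[int, Prediction],
    objective: Callable[[float, float], float],
) -> int:
    """Return the frequency (MHz) whose prediction minimizes the objective."""
    return min(predictions, key=lambda f: objective(*predictions[f]))


# Hypothetical predictions keyed by core frequency in MHz (illustrative only).
predictions: Dict[int, Prediction] = {
    1380: (250.0, 10.0),
    1200: (200.0, 10.5),
    1005: (150.0, 12.0),
    810: (130.0, 15.0),
}

best_edp = select_frequency(predictions, edp)
best_ed2p = select_frequency(predictions, ed2p)
print(f"EDP-optimal frequency:  {best_edp} MHz")
print(f"ED2P-optimal frequency: {best_ed2p} MHz")
```

With these placeholder numbers the two objectives choose different frequencies: ED²P weights execution time more heavily than EDP, so it tends to favor higher frequencies and give up some energy savings to protect performance.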
Date of Conference: 19-23 September 2022
Date Added to IEEE Xplore: 01 November 2022