GreenFlow: A Carbon-Efficient Scheduler for Deep Learning Workloads


Abstract:

Deep learning (DL) has become a key component of modern software, but training DL models produces large carbon emissions. In data centers, it is important to reduce these emissions while still completing DL training jobs early. In this article, we propose GreenFlow, a GPU cluster scheduler that reduces the average Job Completion Time (JCT) under a carbon emission budget. We first present performance models for DL training jobs that predict throughput and energy consumption under different configurations. Based on these performance models and the carbon intensity of the grid, GreenFlow dynamically allocates GPUs and adjusts the GPU-level and job-level configurations of DL training jobs. GreenFlow applies network packing and buddy allocation to job placement, thus avoiding the extra carbon emissions caused by resource fragmentation. Evaluations on a real testbed show that, when emitting the same amount of carbon, GreenFlow can improve the average JCT by up to 2.15× compared to competitive baselines.
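
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not GreenFlow's algorithm: the `Config` class, the function names, and all numbers are hypothetical, and the only fact it relies on is that emitted carbon equals energy consumed multiplied by the grid's carbon intensity. Given predicted throughput and power for each candidate configuration of a single job, it picks the fastest configuration whose estimated emissions stay within a budget.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

JOULES_PER_KWH = 3.6e6


@dataclass(frozen=True)
class Config:
    """Hypothetical job configuration with predicted performance."""
    name: str
    throughput: float   # predicted samples per second
    power_watts: float  # predicted average power draw in watts


def completion_time_s(cfg: Config, total_samples: float) -> float:
    """Predicted time in seconds to process total_samples."""
    return total_samples / cfg.throughput


def carbon_grams(cfg: Config, total_samples: float,
                 intensity_g_per_kwh: float) -> float:
    """Estimated emissions: energy (kWh) times carbon intensity (gCO2/kWh)."""
    energy_joules = cfg.power_watts * completion_time_s(cfg, total_samples)
    return energy_joules / JOULES_PER_KWH * intensity_g_per_kwh


def fastest_within_budget(configs: Sequence[Config], total_samples: float,
                          intensity_g_per_kwh: float,
                          budget_g: float) -> Optional[Config]:
    """Return the fastest config whose emissions fit the budget, or None."""
    feasible = [
        c for c in configs
        if carbon_grams(c, total_samples, intensity_g_per_kwh) <= budget_g
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: completion_time_s(c, total_samples))


# Illustrative numbers only (assumptions, not measurements from the paper).
configs = [
    Config("low-power", throughput=1000.0, power_watts=300.0),
    Config("high-power", throughput=2000.0, power_watts=900.0),
]
choice = fastest_within_budget(configs, total_samples=3.6e6,
                               intensity_g_per_kwh=400.0, budget_g=150.0)
print(choice.name if choice else "no feasible configuration")
```

With these made-up numbers, the faster configuration finishes in half the time but draws three times the power, so it emits more carbon for the same work; a tight budget forces the slower configuration, while a looser one allows the faster one. A cluster scheduler such as the one described above would additionally have to handle many jobs, GPU allocation, placement, and a carbon intensity that changes over time, none of which this sketch attempts.
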
Published in: IEEE Transactions on Parallel and Distributed Systems (Volume: 36, Issue: 2, February 2025)
Page(s): 168 - 184
Date of Publication: 14 October 2024
