Open access

Energy Time Fairness: Balancing Fair Allocation of Energy and Time for GPU Workloads

Authors:

Walid A. Hanafy,

Prashant Shenoy

SEC '23: Proceedings of the Eighth ACM/IEEE Symposium on Edge Computing

Pages 53 - 66

https://doi.org/10.1145/3583740.3628435

Published: 07 August 2024

Abstract

Traditionally, multi-tenant cloud and edge platforms use fair-share schedulers to fairly multiplex resources across applications. These schedulers ensure applications receive processing time proportional to a configurable share of the total time. Unfortunately, enforcing time-fairness across applications often violates energy-fairness, such that some applications consume more than their fair share of energy. This occurs because applications either do not fully utilize their resources or operate at a reduced frequency/voltage during their time-slice. The problem is particularly acute for machine learning (ML) applications using GPUs, where model size largely dictates utilization and energy usage. Enforcing energy-fairness is also important since energy is a costly and limited resource. For example, in cloud platforms, energy dominates operating costs and is limited by the power delivery infrastructure, while in edge platforms, energy is often scarce and limited by energy harvesting and battery constraints.
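To make the mismatch concrete, consider a minimal sketch with two applications that split GPU time equally but draw different average power during their time-slices. The power figures and the helper `energy_shares` are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def energy_shares(time_shares, avg_power_watts):
    """Return each app's fraction of total energy, given its time share and
    its (assumed constant) average power draw while it holds the GPU."""
    energy = [t * p for t, p in zip(time_shares, avg_power_watts)]
    total = sum(energy)
    return [e / total for e in energy]


if __name__ == "__main__":
    # Hypothetical workload: a small model that underutilizes the GPU (60 W)
    # and a large model that keeps it busy (180 W), each given half the time.
    time_shares = [0.5, 0.5]
    avg_power = [60.0, 180.0]
    print(energy_shares(time_shares, avg_power))  # [0.25, 0.75]
```

Under these assumed numbers, a perfectly time-fair schedule gives the second application three times the energy of the first, which is the kind of imbalance the abstract describes.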

To address this problem, we define the notion of Energy-Time Fairness (ETF), which enables a configurable tradeoff between energy and time fairness, and then design a scheduler that enforces it. We show that ETF satisfies many well-accepted fairness properties. ETF and the new tradeoff it offers are important, as some applications, especially ML models, are time/latency-sensitive and others are energy-sensitive. Thus, while enforcing pure energy-fairness starves time/latency-sensitive applications (of time) and enforcing pure time-fairness starves energy-sensitive applications (of energy), ETF bridges the gap between the two. We implement an ETF scheduler, and show that it improves fairness by up to 2×, incentivizes energy efficiency, and exposes a configurable knob to operate between energy- and time-fairness.
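The abstract does not give the formal definition of ETF, so the following is only a toy sketch of what a configurable knob between time- and energy-fairness could look like. It assumes a single GPU shared in fixed quanta, a constant power draw per application, and a simple linear blend of normalized time and energy usage. The blend, the parameter `alpha`, the function `simulate`, and the power figures are all hypothetical and should not be read as the paper's scheduler.

```python
def simulate(powers, alpha, quanta=1200, quantum_s=1.0):
    """Toy quantum-based scheduler (illustrative assumption, not the paper's).

    powers : assumed average power (watts) each app draws while running
    alpha  : hypothetical knob; 0.0 = pure time-fairness,
             1.0 = pure energy-fairness
    Returns (time_used, energy_used) per app after `quanta` quanta.
    """
    n = len(powers)
    time_used = [0.0] * n
    energy_used = [0.0] * n
    # Normalize energy by the cost of one quantum on the hungriest app so
    # that time and energy usage are on comparable scales.
    max_energy = max(powers) * quantum_s
    for _ in range(quanta):
        usage = [
            alpha * (energy_used[i] / max_energy)
            + (1.0 - alpha) * (time_used[i] / quantum_s)
            for i in range(n)
        ]
        # Give the next quantum to the app with the least blended usage.
        nxt = usage.index(min(usage))
        time_used[nxt] += quantum_s
        energy_used[nxt] += powers[nxt] * quantum_s
    return time_used, energy_used


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        t, e = simulate([60.0, 180.0], alpha)
        print(f"alpha={alpha}: time={t}, energy={e}")
```

In this sketch, `alpha = 0.0` equalizes time (and therefore skews energy toward the power-hungry application), `alpha = 1.0` approximately equalizes energy (and therefore gives the low-power application more time), and intermediate values fall in between.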

Index Terms

Energy Time Fairness: Balancing Fair Allocation of Energy and Time for GPU Workloads
1. General and reference
  1. Cross-computing tools and techniques

Published In

SEC '23: Proceedings of the Eighth ACM/IEEE Symposium on Edge Computing

December 2023

405 pages

ISBN:9798400701238

DOI:10.1145/3583740

Chair:
Kewei Sha,
Program Chairs:
Suman Banerjee,
Jiasi Chen
University of Michigan

Copyright © 2023 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMOBILE: ACM Special Interest Group on Mobility of Systems, Users, Data and Computing
IEEE Computer Society

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 August 2024

Qualifiers

Research-article

Conference

SEC '23

Sponsor:

SIGMOBILE

SEC '23: Eighth ACM/IEEE Symposium on Edge Computing

December 6 - 9, 2023

Wilmington, DE, USA

Acceptance Rates

Overall Acceptance Rate 40 of 100 submissions, 40%
