ABSTRACT
In this paper, we investigate the energy consumption and running time of different CNN tasks on GPUs and CPUs, and characterize their behavior across CNN models under varying application and system configuration factors. We show that the joint optimization of energy consumption and makespan can be formulated as an integer linear programming (ILP) problem. We then propose CHESS (CNN-task Heterogeneous Efficient Scheduling System), which uses a two-stage heuristic scheduling algorithm to allocate computing resources to incoming tasks and schedule them dynamically on a heterogeneous cluster. Experiments show that CHESS saves up to 15.9% energy and reduces makespan by up to 32.7% compared with existing approaches.
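The abstract does not reproduce the ILP itself; the following is a minimal sketch of how a joint energy and makespan objective is commonly written as an integer linear program, assuming binary placement variables x_{ij} (task i on device j), profiled per-pair energy e_{ij} and runtime t_{ij}, and a trade-off weight λ. All notation here is illustrative, not taken from the paper.

```latex
% Illustrative joint energy/makespan ILP; symbols are assumptions,
% not the paper's notation.
\begin{align}
  \min_{x,\, M} \quad & \lambda \sum_{i=1}^{n} \sum_{j=1}^{m} e_{ij}\, x_{ij} + (1 - \lambda)\, M \\
  \text{s.t.} \quad & \sum_{j=1}^{m} x_{ij} = 1 \quad \forall i
      && \text{each task runs on exactly one device} \\
  & \sum_{i=1}^{n} t_{ij}\, x_{ij} \le M \quad \forall j
      && \text{per-device load bounds the makespan} \\
  & x_{ij} \in \{0, 1\} \quad \forall i, j.
\end{align}
```

Introducing the auxiliary variable M and bounding every device's total load by it is the standard linearization of the max over device finish times that defines the makespan.

Likewise, the two-stage heuristic is only named in the abstract; the Python sketch below shows one plausible shape under stated assumptions: a first stage that orders pending tasks by estimated work, and a second stage that greedily places each task on the device minimizing a weighted energy plus finish-time score. The class names, cost tables, and scoring rule are all hypothetical, for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    finish_time: float = 0.0  # time at which this device becomes free

@dataclass
class Task:
    name: str
    runtime: dict  # profiled per-device runtime, e.g. {"gpu0": 2.0, "cpu0": 9.0}
    energy: dict   # profiled per-device energy,  e.g. {"gpu0": 5.0, "cpu0": 4.0}

def schedule(tasks, devices, lam=0.5):
    """Two-stage heuristic (illustrative, not the paper's algorithm).

    Stage 1: order tasks, largest estimated work first, so heavy CNN
             tasks are placed while the cluster is still lightly loaded.
    Stage 2: greedily assign each task to the device minimizing a
             weighted sum of energy and resulting finish time.
    """
    # Stage 1: prioritize by average runtime across all devices.
    ordered = sorted(
        tasks,
        key=lambda t: sum(t.runtime.values()) / len(t.runtime),
        reverse=True,
    )
    plan = []
    for task in ordered:
        # Stage 2: score every device and pick the cheapest placement.
        best = min(
            devices,
            key=lambda d: (
                lam * task.energy[d.name]
                + (1 - lam) * (d.finish_time + task.runtime[d.name])
            ),
        )
        best.finish_time += task.runtime[best.name]
        plan.append((task.name, best.name))
    makespan = max(d.finish_time for d in devices)
    return plan, makespan

if __name__ == "__main__":
    devices = [Device("gpu0"), Device("cpu0")]
    tasks = [
        Task("resnet50", {"gpu0": 2.0, "cpu0": 9.0}, {"gpu0": 5.0, "cpu0": 4.0}),
        Task("mobilenet", {"gpu0": 0.5, "cpu0": 1.5}, {"gpu0": 1.2, "cpu0": 0.8}),
    ]
    print(schedule(tasks, devices))
```

The weight lam plays the same role as λ in the ILP sketch above: setting it near 1 favors energy savings, while setting it near 0 favors a short makespan.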