DOI: 10.1145/3587716.3587744
research-article

CHESS: Joint Energy and Makespan Optimization for Dynamic CNN Task Scheduling on the Heterogeneous System

Published: 07 September 2023

ABSTRACT

In this paper, we investigate the energy consumption and running time of different CNN tasks on GPUs or CPUs, and characterize both for different CNN models under varying application and system configuration factors. We find that the joint energy consumption and makespan optimization problem can be formulated as an integer linear programming problem. We then propose CHESS (CNN-task Heterogeneous Efficient Scheduling System), which uses a two-stage heuristic scheduling algorithm to allocate computing resources to upcoming tasks and to schedule them dynamically on a heterogeneous cluster. Experiments show that CHESS reduces energy consumption by up to 15.9% and makespan by up to 32.7% compared with existing approaches.
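
To make the joint objective concrete, the following is a minimal sketch of a task-to-device assignment problem that trades off energy against makespan. It is not the CHESS algorithm: the abstract does not give the formulation, so the weighted-sum objective, the weight `alpha`, the assumption that tasks on one device run back to back, the function names, and all numbers below are hypothetical. It uses exhaustive search over integer assignments in place of the paper's two-stage heuristic, which is only feasible for very small instances.

```python
from itertools import product

# Hypothetical profile: RUNTIME[t][d] and ENERGY[t][d] are the running time
# and energy of task t on device d (device 0 = CPU, device 1 = GPU).
# These numbers are made up for illustration, not measurements from the paper.
RUNTIME = [[10, 2], [8, 3], [4, 4]]
ENERGY = [[5, 6], [4, 7], [2, 8]]


def evaluate(assignment, runtime, energy):
    """Return (total_energy, makespan) for one task-to-device assignment.

    Assumes tasks placed on the same device run one after another, so a
    device's finish time is the sum of its tasks' running times.
    """
    n_devices = len(runtime[0])
    load = [0.0] * n_devices
    total_energy = 0.0
    for task, dev in enumerate(assignment):
        load[dev] += runtime[task][dev]
        total_energy += energy[task][dev]
    return total_energy, max(load)


def best_assignment(runtime, energy, alpha=0.5):
    """Exhaustively search all assignments for the lowest weighted cost.

    cost = alpha * total_energy + (1 - alpha) * makespan (an assumed objective).
    Returns (cost, assignment, total_energy, makespan).
    """
    n_tasks = len(runtime)
    n_devices = len(runtime[0])
    best = None
    for assignment in product(range(n_devices), repeat=n_tasks):
        total_energy, makespan = evaluate(assignment, runtime, energy)
        cost = alpha * total_energy + (1 - alpha) * makespan
        if best is None or cost < best[0]:
            best = (cost, assignment, total_energy, makespan)
    return best


if __name__ == "__main__":
    print(best_assignment(RUNTIME, ENERGY, alpha=0.5))
```

With `alpha=1.0` the search minimizes energy only and places every task on the CPU in this toy instance. With `alpha=0.5` it moves the first two tasks to the GPU, accepting more energy for a shorter makespan. The search space grows exponentially with the number of tasks, which is one reason a heuristic scheduler is attractive in practice.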


            Published in

              ICMLC '23: Proceedings of the 2023 15th International Conference on Machine Learning and Computing
              February 2023
              619 pages
              ISBN: 9781450398411
              DOI: 10.1145/3587716

              Copyright © 2023 ACM


              Publisher

              Association for Computing Machinery

              New York, NY, United States



              Qualifiers

              • research-article
              • Research
              • Refereed limited