DOI: 10.1145/3605573.3605612
research-article

DeepPower: Deep Reinforcement Learning based Power Management for Latency Critical Applications in Multi-core Systems

Published: 13 September 2023

ABSTRACT

Latency-critical (LC) applications are widely deployed in modern datacenters. Effective power management for LC applications can yield significant cost savings, but maintaining the desired Service Level Agreement (SLA) levels while saving power is a significant challenge. Prior research has mainly focused on predicting the service time of requests and using heuristic algorithms for CPU frequency adjustment. Unfortunately, the control granularity of these approaches is limited to the request level, and manual feature selection is needed.

This paper proposes DeepPower, a deep reinforcement learning (DRL) based power management solution for LC applications. DeepPower comprises two key components: a DRL agent that monitors system load changes and a thread controller that adjusts CPU frequency. Given the high overhead of the neural network and the short service time of requests, it is infeasible to employ DRL to adjust CPU frequency directly at the request level. Instead, DeepPower uses a hierarchical control mechanism: the DRL agent adjusts the parameter of the thread controller at longer intervals, and the thread controller adjusts the CPU frequency at shorter intervals. This control mechanism enables DeepPower to adapt to dynamic workloads and achieve fine-grained frequency adjustments. We evaluate DeepPower with several common LC applications under dynamic workloads. The experimental results show that DeepPower saves up to 28.4% power compared with state-of-the-art methods and reduces the percentage of request timeouts.
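
The two-timescale structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the frequency levels, the single `scale` parameter, the queue-length heuristic, the 5% timeout threshold, and the fixed-rule `stub_agent` (which stands in for a trained DRL policy) are all assumptions made for illustration. The sketch only shows the shape of the hierarchy: a slow loop retunes a controller parameter, and a fast loop picks a frequency at every step.

```python
from dataclasses import dataclass

# Hypothetical frequency levels in GHz; not taken from the paper.
FREQ_LEVELS_GHZ = [1.2, 1.6, 2.0, 2.4]


@dataclass
class ThreadController:
    """Fast loop: maps the current queue length to a CPU frequency.

    `scale` is the one tunable parameter that the slow loop adjusts
    (an assumption made for this sketch).
    """
    scale: float = 1.0

    def pick_frequency(self, queue_len: int) -> float:
        # A larger scale selects a higher frequency for the same load.
        idx = min(int(queue_len * self.scale), len(FREQ_LEVELS_GHZ) - 1)
        return FREQ_LEVELS_GHZ[idx]


def stub_agent(timeout_ratio: float, scale: float) -> float:
    """Slow loop: stand-in for the DRL agent (a fixed rule, not a learned policy)."""
    if timeout_ratio > 0.05:
        # Too many timeouts: make the controller more aggressive.
        return min(scale * 2.0, 4.0)
    # Few timeouts: relax the controller to save power.
    return max(scale * 0.5, 0.25)


def run(queue_trace, timeout_trace, agent_period=4):
    """Run the fast loop every step and the slow loop every `agent_period` steps."""
    controller = ThreadController()
    freqs, scales = [], []
    for step, queue_len in enumerate(queue_trace):
        if step % agent_period == 0:
            # Long interval: the agent retunes the controller parameter.
            observed = timeout_trace[step // agent_period]
            controller.scale = stub_agent(observed, controller.scale)
        # Short interval: the controller picks a frequency for this step.
        freqs.append(controller.pick_frequency(queue_len))
        scales.append(controller.scale)
    return freqs, scales


if __name__ == "__main__":
    freqs, scales = run([0, 1, 2, 3, 0, 1, 2, 3], [0.10, 0.0])
    print(freqs)
    print(scales)
```

In this sketch the costly decision (here a trivial rule, in the paper a neural network) runs once every `agent_period` steps, while the cheap per-step decision stays a table lookup. That split is what lets a hierarchical design keep fine-grained frequency control without paying the inference cost on every request.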

    • Published in

      ICPP '23: Proceedings of the 52nd International Conference on Parallel Processing
      August 2023
      858 pages
      ISBN: 9798400708435
      DOI: 10.1145/3605573

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      Overall Acceptance Rate: 91 of 313 submissions, 29%