
Deriving a near-optimal power management policy using model-free reinforcement learning and Bayesian classification

Published: 05 June 2011

Abstract

To cope with the variations and uncertainties that emanate from hardware and application characteristics, dynamic power management (DPM) frameworks must be able to learn about the system inputs and environment and adjust the power management policy on the fly. In this paper we present an online adaptive DPM technique based on model-free reinforcement learning (RL), which is commonly used to control stochastic dynamical systems. In particular, we employ temporal difference (TD) learning for semi-Markov decision processes (SMDPs) as the model-free RL method. In addition, a novel workload predictor based on an online Bayes classifier is presented to provide effective estimates of the workload states for the RL algorithm. In this DPM framework, the power-latency tradeoff can be precisely controlled through a user-defined parameter. Experiments show average power savings of up to 16.7%, without any increase in latency, compared to a reference expert-based approach. Alternatively, per-request latency is reduced by up to 28.6%, without any increase in power consumption, compared to the same expert-based approach.
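
The abstract commits to two concrete mechanisms: TD learning over an SMDP (so the learner must account for the variable sojourn time spent in each power state) and an online Bayes classifier that predicts the workload state fed to the RL agent. The Python sketch below shows one plausible shape for this pairing; the power states, features, constants, and the power-plus-weighted-latency cost are illustrative assumptions, not details taken from the paper, and the continuous-time discounting follows the standard Bradtke and Duff formulation rather than the authors' exact update.

```python
import math
from collections import defaultdict

# Illustrative constants; none of these values come from the paper.
POWER_STATES = ("sleep", "idle", "active")   # hypothetical device power modes
GAMMA_RATE = 0.1        # assumed continuous-time discount rate
ALPHA = 0.05            # assumed learning rate
LAMBDA_TRADEOFF = 0.5   # assumed user-defined power/latency tradeoff weight


class OnlineBayesWorkloadPredictor:
    """Naive Bayes over a discretized workload feature (e.g. binned request
    inter-arrival time), updated one observation at a time."""

    def __init__(self, states=("low", "high")):
        self.states = states
        self.state_counts = defaultdict(int)    # label -> count
        self.pair_counts = defaultdict(int)     # (label, feature) -> count
        self.total = 0

    def update(self, feature, label):
        # Fold in one observed (feature, true workload state) pair.
        self.state_counts[label] += 1
        self.pair_counts[(label, feature)] += 1
        self.total += 1

    def predict(self, feature):
        # MAP class under Laplace smoothing, so unseen pairs never zero out.
        def log_score(s):
            prior = (self.state_counts[s] + 1) / (self.total + len(self.states))
            like = (self.pair_counts[(s, feature)] + 1) / (self.state_counts[s] + 2)
            return math.log(prior) + math.log(like)
        return max(self.states, key=log_score)


def smdp_td_update(Q, state, action, power, latency, tau, next_state):
    """One model-free TD(0) update for an SMDP: the sojourn time tau shrinks
    the successor's value by exp(-GAMMA_RATE * tau), the continuous-time
    discounting of Bradtke and Duff. Costs are minimized, with latency
    weighted against power by the user-set LAMBDA_TRADEOFF."""
    cost = power * tau + LAMBDA_TRADEOFF * latency
    discount = math.exp(-GAMMA_RATE * tau)
    best_next = min(Q[(next_state, a)] for a in POWER_STATES)
    Q[(state, action)] += ALPHA * (cost + discount * best_next - Q[(state, action)])


# Tiny usage example with made-up numbers.
Q = defaultdict(float)
predictor = OnlineBayesWorkloadPredictor()
predictor.update(feature="short_gap", label="high")
workload = predictor.predict("short_gap")
smdp_td_update(Q, state=("idle", workload), action="sleep",
               power=0.3, latency=2.0, tau=5.0,
               next_state=("sleep", workload))
```

In a full DPM loop, the predictor's output would be folded into the RL state so that each smdp_td_update call conditions on the predicted workload; LAMBDA_TRADEOFF plays the role of the user-defined knob the abstract describes for trading power against latency.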




    Published In

    DAC '11: Proceedings of the 48th Design Automation Conference
    June 2011
    1055 pages
    ISBN:9781450306362
    DOI:10.1145/2024724

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Bayes classification
    2. dynamic power management
    3. reinforcement learning

    Qualifiers

    • Research-article

    Conference

    DAC '11

    Acceptance Rates

Overall acceptance rate: 1,770 of 5,499 submissions (32%)


