ABSTRACT
Continued improvement in computer performance is increasingly limited by the exponential growth in power consumption. To improve the energy efficiency of multicore chips, we propose a novel global power management technique. Its goal is to deliver the maximum performance under a fixed power budget without significant overhead. To tackle the exponential complexity of power management across multiple cores, we apply a reinforcement learning technique, Q-learning, at the core level and then use a chip-level intelligent controller to optimize the power distribution among all cores. The power assignment adapts dynamically at runtime to the needs of the applications. The technique was evaluated using the PARSEC benchmark suite on a full-system simulator. On average, the experimental results show that the proposed technique increases overall performance by 39% for a fixed power budget and improves the energy-delay product (EDP) by 28%, compared to the non-DVFS baseline implementation.
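The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frequency levels, the cubic power model, the state and action encoding, and the proportional-scaling allocator are all assumptions chosen for illustration, since the abstract does not specify them. Only the tabular Q-learning update and the definition of EDP as energy times delay are standard.

```python
import random

# Hypothetical discrete frequency levels in GHz (not taken from the paper).
FREQ_LEVELS = [1.0, 1.5, 2.0, 2.5]


def core_power(freq_ghz):
    """Toy power model (assumption): power grows with the cube of frequency."""
    return 0.5 * freq_ghz ** 3


def edp(energy, delay):
    """Energy-delay product: lower is better."""
    return energy * delay


class CoreAgent:
    """Per-core tabular Q-learning agent; states and actions are placeholders."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning update toward reward + gamma * max Q(next)."""
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])


def allocate_budget(requests, budget):
    """Chip-level step (simple stand-in): scale the per-core power requests
    proportionally whenever their sum exceeds the chip budget."""
    total = sum(requests)
    if total <= budget:
        return list(requests)
    scale = budget / total
    return [r * scale for r in requests]


def highest_level_within(power_cap):
    """Index of the fastest level whose modeled power fits the cap.
    Falls back to the lowest level (index 0) if none fits."""
    best = 0
    for i, freq in enumerate(FREQ_LEVELS):
        if core_power(freq) <= power_cap:
            best = i
    return best
```

In such a sketch, each core's agent would propose a power level from its learned Q-values, the chip-level allocator would trim the proposals to fit the budget, and each core would then run at the fastest level its share allows. A reward that trades performance against power would be needed to train the agents; its form is left open here because the abstract does not give one.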
Index Terms
- Scalable and Dynamic Global Power Management for Multicore Chips