ABSTRACT
The increasing power dissipation of current processors and processor cores constrains design options, raises packaging, cooling, and power delivery costs, and decreases reliability. Much research has focused on decreasing average power dissipation, which most directly addresses cooling costs and reliability. Far less has been done to decrease peak power, which most directly affects processor design, packaging, and power delivery. This research proposes a new architecture that significantly decreases peak power with limited performance loss by using a highly adaptive processor. Many components of the processor can be configured at different levels, but because they are centrally controlled, the architecture can guarantee that they are never all configured maximally at the same time. This paper describes the adaptive processor and explores mechanisms for transitioning between allowed configurations to maximize performance within a peak power constraint. Such an architecture can cut peak power by 25% with less than 5% performance loss; among other advantages, this frees 5.3% of total core area used for decoupling capacitors.
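The central idea, a controller that only ever selects configurations from a precomputed set whose worst-case power fits the budget, can be illustrated with a minimal sketch. The component names, per-level peak power figures, and function names below are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's actual table construction and transition policy may differ.

```python
from itertools import product

# Hypothetical per-component peak power (watts) at each configuration level,
# ordered from smallest to largest setting. Illustrative values only.
COMPONENT_PEAK_W = {
    "icache": [2.0, 3.0, 4.0],
    "dcache": [2.5, 3.5, 5.0],
    "issue_queue": [1.5, 2.5, 4.0],
    "alus": [3.0, 5.0, 7.0],
}


def max_peak(components):
    """Peak power when every component is at its largest setting."""
    return sum(levels[-1] for levels in components.values())


def config_peak(components, config):
    """Worst-case peak power of one configuration (component -> level index)."""
    return sum(components[name][level] for name, level in config.items())


def build_allowed_table(components, budget_w):
    """Enumerate all configurations and keep those within the peak budget."""
    names = list(components)
    table = []
    for levels in product(*(range(len(components[n])) for n in names)):
        config = dict(zip(names, levels))
        if config_peak(components, config) <= budget_w:
            table.append(config)
    return table


def can_transition(table, target):
    """The central controller only permits moves to configurations in the table."""
    return target in table


# Budget set 25% below the all-maximal peak, mirroring the abstract's headline figure.
budget = 0.75 * max_peak(COMPONENT_PEAK_W)
table = build_allowed_table(COMPONENT_PEAK_W, budget)

all_max = {name: len(levels) - 1 for name, levels in COMPONENT_PEAK_W.items()}
all_min = {name: 0 for name in COMPONENT_PEAK_W}
```

Because every entry in `table` is checked against the budget ahead of time, the all-maximal configuration `all_max` is rejected by `can_transition`, while any single component can still be set to its largest level as long as others are scaled down to compensate.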
Reducing peak power with a table-driven adaptive processor core