Abstract
Current microprocessors face constant thermal and power-related problems during everyday use, which are usually addressed by imposing a power budget on the processor or core. Dynamic voltage and frequency scaling (DVFS) has been an effective technique for matching a predefined power budget. However, the continuous increase in leakage power due to technology scaling, along with the low resolution of DVFS, makes it less attractive for this purpose as technology moves into the deep-submicron regime. In this paper, we propose the use of microarchitectural techniques to accurately match a power constraint while maximizing the energy efficiency of the processor. We predict the processor's power dissipation at the cycle level (power token throttling) or at the basic block level (basic block level mechanism), and use the dissipated power, translated into tokens, to select among different power-saving microarchitectural techniques. We also introduce a two-level approach in which DVFS acts as a coarse-grain technique that lowers the average power dissipation towards the power budget, while microarchitectural techniques focus on removing the numerous power spikes. Experimental results show that using power-saving microarchitectural techniques in conjunction with DVFS is up to six times more precise, in terms of total energy consumed over the power budget, than using DVFS alone to match a predefined power budget.
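The precision metric mentioned above, total energy consumed over the power budget, can be illustrated with a minimal sketch. It assumes a per-cycle power trace in watts and a fixed cycle time; the function name `energy_over_budget` and the sample values are hypothetical and are not taken from the paper.

```python
def energy_over_budget(power_trace_w, budget_w, cycle_time_s):
    """Energy (in joules) dissipated above the power budget.

    Assumption: power_trace_w holds one power sample per cycle, and only
    the portion of each sample that exceeds budget_w counts towards the
    metric. Cycles at or below the budget contribute nothing.
    """
    excess_w = sum(max(0.0, p - budget_w) for p in power_trace_w)
    return excess_w * cycle_time_s


# Hypothetical four-cycle trace with a budget of 11 W and a 1 s "cycle"
# chosen only to keep the arithmetic easy to follow.
sample_trace_w = [10.0, 14.0, 12.0, 9.0]
sample_result_j = energy_over_budget(sample_trace_w, budget_w=11.0, cycle_time_s=1.0)
```

Under this reading, a mechanism that matches the budget more precisely yields a smaller value, because fewer and lower power spikes remain above the budget line.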
Notes
Register Update Unit.
8 bits per entry.
Defined in this paper as the stream of instructions between two branches.
Thermal design power.
Using Wattch’s cc3 clock-gating style, which scales power linearly with unit usage. Inactive units still dissipate 10 % of their maximum power.
For McPAT, \(V_{DD}\) at 32 nm is 1.2 V, so each 5 % reduction in voltage corresponds to 60 mV. Switching between modes therefore takes 1.2 ns (that is, a transition rate of 50 mV/ns). As the processor runs at 3 GHz, this amounts to 3.6 cycles, rounded up to 4, to change between power modes.
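The arithmetic in the last two notes can be checked with a short sketch. The function names are hypothetical, and the 50 mV/ns transition rate is an assumption inferred from the stated 60 mV step and 1.2 ns switching time rather than a figure quoted from the paper.

```python
import math


def cc3_power(max_power_w, usage, idle_fraction=0.10):
    """Power of one unit under a cc3-style clock-gating model.

    Assumption: usage is the fraction of the unit in use this cycle (0..1).
    An active unit scales linearly with usage; a fully inactive unit still
    dissipates idle_fraction of its maximum power.
    """
    if not 0.0 <= usage <= 1.0:
        raise ValueError("usage must be between 0 and 1")
    if usage == 0.0:
        return idle_fraction * max_power_w
    return usage * max_power_w


def dvfs_switch_time_ns(vdd_mv=1200.0, step_percent=5.0, slew_mv_per_ns=50.0):
    """Time to move the supply voltage by one DVFS step.

    slew_mv_per_ns is an inferred value (60 mV in 1.2 ns), not a quoted one.
    """
    step_mv = vdd_mv * step_percent / 100.0  # 5 % of 1.2 V is 60 mV
    return step_mv / slew_mv_per_ns


def dvfs_switch_cycles(switch_ns, clock_ghz=3.0):
    """Whole cycles needed to cover the switch time at the given clock."""
    return math.ceil(switch_ns * clock_ghz)


switch_ns = dvfs_switch_time_ns()
switch_cycles = dvfs_switch_cycles(switch_ns)
```

Rounding up with `math.ceil` reflects the note's "3.6 (4) cycles": a mode change cannot complete partway through a cycle, so the fractional result is charged as a full fourth cycle.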
References
Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C. Intel Corporation
Aragon JL, Gonzalez J, Gonzalez A (2003) Power-aware control speculation through selective throttling. In: Proceedings of the 9th international symposium on high-performance computer architecture, pp 103–112. doi:10.1109/HPCA.2003.1183528
Bacha A, Teodorescu R (2013) Dynamic reduction of voltage margins by leveraging on-chip ECC in Itanium II processors. In: Proceedings of the 40th annual international symposium on computer architecture, ISCA ’13. ACM, New York, pp 297–307. doi:10.1145/2485922.2485948
Baniasadi A, Moshovos A (2001) Instruction flow-based front-end throttling for power-aware high-performance processors. In: Proceedings of the international symposium on low power electronics and design, pp 16–21. doi:10.1109/LPE.2001.945365
Brooks DM, Bose P, Schuster SE, Jacobson H, Kudva PN, Buyuktosunoglu A, Wellman JD, Zyuban V, Gupta M, Cook PW (2000) Power-aware microarchitecture: design and modeling challenges for next-generation microprocessors. IEEE Micro 20:26–44. doi:10.1109/40.888701. http://portal.acm.org/citation.cfm?id=623296.624401
Casmira J, Grunwald D (2000) Dynamic instruction scheduling slack. In: Proceedings of the KoolChips workshop
Cebrian JM, Aragon JL, Garcia JM, Petoumenos P, Kaxiras S (2009) Efficient microarchitecture policies for accurately adapting to power constraints. In: Proceedings of the IEEE international parallel & distributed processing symposium, pp 1–12. doi:10.1109/IPDPS.2009.5161022
Donald J, Martonosi M (2006) Techniques for multicore thermal management: classification and new exploration. In: Proceedings of the 33rd international symposium on computer architecture, pp 78–88. doi:10.1109/ISCA.2006.39
Esmaeilzadeh H, Blem E, St Amant R, Sankaralingam K, Burger D (2011) Dark silicon and the end of multicore scaling. In: Proceedings of the 38th annual international symposium on computer architecture, ISCA ’11. ACM, New York, pp 365–376. doi:10.1145/2000064.2000108
Homayoun H, Kontorinis V, Shayan A, Lin TW, Tullsen D (2012) Dynamically heterogeneous cores through 3D resource pooling. In: 2012 IEEE 18th international symposium on high performance computer architecture (HPCA), pp 1–12. doi:10.1109/HPCA.2012.6169037
Isci C, Buyuktosunoglu A, Cher CY, Bose P, Martonosi M (2006) An analysis of efficient multi-core global power management policies: maximizing performance for a given power budget. In: Proceedings of the 39th annual IEEE/ACM international symposium on microarchitecture, pp 347–358. doi:10.1109/MICRO.2006.8
Kim W, Gupta MS, Wei GY, Brooks D (2008) System level analysis of fast, per-core DVFS using on-chip switching regulators. In: Proceedings of the IEEE 14th international symposium on high performance computer architecture, pp 123–134. doi:10.1109/HPCA.2008.4658633
Kontorinis V, Shayan A, Tullsen DM, Kumar R (2009) Reducing peak power with a table-driven adaptive processor core. In: Proceedings of the 42nd annual IEEE/ACM international symposium on microarchitecture, MICRO 42. ACM, New York, pp 189–200. doi:10.1145/1669112.1669137
Li S, Ahn JH, Strong RD, Brockman JB, Tullsen DM, Jouppi NP (2009) McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures. In: Proceedings of the 42nd international symposium on microarchitecture, pp 469–480
Macken P, Degrauwe M, Van Paemel M, Oguey H (1990) A voltage reduction technique for digital systems. In: Proceedings of the 37th IEEE international solid-state circuits conference. Digest of Technical Papers, pp 238–239. doi:10.1109/ISSCC.1990.110213
Manne S, Klauser A, Grunwald D (1998) Pipeline gating: speculation control for energy reduction. In: Proceedings of the 25th annual international symposium on computer architecture, pp 132–141. doi:10.1109/ISCA.1998.694769
Najeeb K, Konda VVR, Hari SKS, Kamakoti V, Vedula VM (2007) Power virus generation using behavioral models of circuits. In: Proceedings of the 25th IEEE VLSI test symposium, VTS ’07. IEEE Computer Society, Washington, pp 35–42. doi:10.1109/VTS.2007.49
Sartori J, Kumar R (2009) Three scalable approaches to improving many-core throughput for a given peak power budget. In: HiPC’09, pp 89–98
Sartori J, Ahrens B, Kumar R (2012) Power balanced pipelines. In: 2012 IEEE 18th international symposium on high performance computer architecture (HPCA), pp 1–12. doi:10.1109/HPCA.2012.6169032
Sasanka R, Hughes CJ, Adve SV (2002) Joint local and global hardware adaptations for energy. In: Proceedings of the 10th international conference on architectural support for programming languages and operating systems, ASPLOS-X. ACM, New York, pp 144–155. doi:10.1145/605397.605413
Semeraro G, Magklis G, Balasubramonian R, Albonesi DH, Dwarkadas S, Scott ML (2002) Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling. In: Proceedings of the 8th international symposium on high-performance computer architecture, pp 29–40. doi:10.1109/HPCA.2002.995696
Simunic T, Benini L, Acquaviva A, Glynn P, de Micheli G (2001) Dynamic voltage scaling and power management for portable systems. In: Proceedings of the design automation conference, pp 524–529. doi:10.1109/DAC.2001.156195
Tune E, Liang D, Tullsen DM, Calder B (2001) Dynamic prediction of critical path instructions. In: Proceedings of the 7th international symposium on high-performance computer architecture, pp 185–195. doi:10.1109/HPCA.2001.903262
Winter JA, Albonesi DH (2008) Addressing thermal nonuniformity in SMT workloads. ACM Trans Archit Code Optim 5:4.1–4.28. doi:10.1145/1369396.1369400
Wu Q, Juang P, Martonosi M, Clark DW (2005) Voltage and frequency control with adaptive reaction time in multiple-clock-domain processors. In: Proceedings of the 11th international symposium on high-performance computer architecture, pp 178–189. doi:10.1109/HPCA.2005.43
Acknowledgments
This work was supported by the Spanish MEC, MICINN and EU Commission FEDER funds under Grants CSD2006-00046 and TIN2009-14475-C04, and by the EU-FP7 ICT Project “Embedded Reconfigurable Architecture (ERA)”, contract No. 249059. In addition, EU-FP7 HiPEAC funded an internship of J.M. Cebrián at U. Uppsala.
Cite this article
Cebrián, J.M., Sánchez, D., Aragón, J.L. et al. Managing power constraints in a single-core scenario through power tokens. J Supercomput 68, 414–442 (2014). https://doi.org/10.1007/s11227-013-1044-2
DOI: https://doi.org/10.1007/s11227-013-1044-2