Managing power constraints in a single-core scenario through power tokens

The Journal of Supercomputing

Abstract

Current microprocessors face constant thermal and power-related problems during everyday use, which are usually addressed by applying a power budget to the processor or core. Dynamic voltage and frequency scaling (DVFS) has been an effective technique for matching a predefined power budget. However, the continuous increase in leakage power due to technology scaling, together with the low resolution of DVFS, makes it less attractive for this purpose as technology moves into the deep-submicron regime. In this paper, we propose the use of microarchitectural techniques to accurately match a power constraint while maximizing the energy efficiency of the processor. We predict the processor's power dissipation at the cycle level (power token throttling) or at the basic block level (basic block level mechanism), and use the dissipated power, translated into tokens, to select between different power-saving microarchitectural techniques. We also introduce a two-level approach in which DVFS acts as a coarse-grain technique that lowers the average power dissipation towards the power budget, while microarchitectural techniques focus on removing the numerous power spikes. Experimental results show that using power-saving microarchitectural techniques in conjunction with DVFS is up to six times more precise, in terms of total energy consumed over the power budget, than using DVFS alone to match a predefined power budget.
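
The precision metric named in the abstract, total energy consumed over the power budget, can be illustrated with a minimal sketch. This is one plausible reading of the metric and not the authors' evaluation code: the function name `energy_over_budget`, the per-cycle sampling, the 12 W budget, and both power traces are hypothetical values chosen only for illustration. The 3 GHz clock is the figure given in the paper's notes.

```python
def energy_over_budget(power_trace_w, budget_w, cycle_time_s):
    """Energy (joules) dissipated above the budget.

    Assumes one power sample (watts) per cycle; only the part of each
    sample that exceeds the budget contributes to the total.
    """
    excess_w = (max(0.0, p - budget_w) for p in power_trace_w)
    return sum(excess_w) * cycle_time_s


CYCLE_TIME_S = 1.0 / 3e9  # 3 GHz clock
BUDGET_W = 12.0           # hypothetical power budget

# Hypothetical traces: a coarse-grain policy that leaves power spikes,
# and a two-level policy that trims most of each spike.
dvfs_only = [10.0, 14.0, 9.0, 16.0]
two_level = [10.0, 12.5, 9.0, 12.5]

over_dvfs = energy_over_budget(dvfs_only, BUDGET_W, CYCLE_TIME_S)
over_two_level = energy_over_budget(two_level, BUDGET_W, CYCLE_TIME_S)

print(f"DVFS only: {over_dvfs:.3e} J over budget")
print(f"Two-level: {over_two_level:.3e} J over budget")
```

Under this reading, a policy is more precise when less energy is accumulated above the budget line, so cycles that stay below the budget do not offset cycles that exceed it.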

Notes

  1. Register Update Unit.

  2. 8 bits per entry.

  3. Assumed in this paper as a stream of instructions between two branches.

  4. Thermal design power.

  5. Using Wattch’s clock-gating style cc3, which scales power linearly with unit usage. Inactive units still dissipate 10 % of their maximum power.

  6. For McPAT, \(V_{DD}\) at 32 nm is 1.2 V, so each 5 % reduction in voltage corresponds to 60 mV. That means it takes 1.2 ns to switch between modes. As the processor runs at 3 GHz, changing between power modes takes 3.6 cycles, rounded up to 4.
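
The arithmetic in notes 5 and 6 can be checked with a short sketch. The function names are hypothetical, and the cc3 model below is only the simplified rule stated in note 5 (linear in usage, 10 % of maximum power when idle), not Wattch's actual implementation. The 50 mV/ns rate that the code derives is merely what the note's own figures imply (60 mV in 1.2 ns); the notes do not state it.

```python
import math


def cc3_unit_power(max_power_w, usage):
    """Simplified cc3-style gating from note 5.

    Power scales linearly with usage in [0, 1]; a fully idle unit
    still dissipates 10 % of its maximum power.
    """
    if not 0.0 <= usage <= 1.0:
        raise ValueError("usage must be in [0, 1]")
    if usage == 0.0:
        return 0.10 * max_power_w
    return usage * max_power_w


def dvfs_switch_cost(vdd_v, step_fraction, switch_time_ns, clock_ghz):
    """Reproduce note 6: voltage step, implied slew rate, cycles per switch."""
    step_mv = vdd_v * step_fraction * 1000.0
    implied_slew_mv_per_ns = step_mv / switch_time_ns
    cycles = switch_time_ns * clock_ghz
    return step_mv, implied_slew_mv_per_ns, cycles, math.ceil(cycles)


step_mv, slew, cycles, whole_cycles = dvfs_switch_cost(
    vdd_v=1.2, step_fraction=0.05, switch_time_ns=1.2, clock_ghz=3.0
)
print(f"step: {step_mv:.1f} mV, implied slew: {slew:.1f} mV/ns")
print(f"switch cost: {cycles:.1f} cycles, rounded up to {whole_cycles}")
print(f"idle 10 W unit under cc3: {cc3_unit_power(10.0, 0.0):.1f} W")
```

Rounding up with `math.ceil` reflects that a mode switch cannot complete in a fraction of a cycle, which is why the note treats 3.6 cycles as 4.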

References

  1. Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 3C. Intel Corporation

  2. Aragon JL, Gonzalez J, Gonzalez A (2003) Power-aware control speculation through selective throttling. In: Proceedings of the 9th international symposium on high-performance computer architecture, pp 103–112. doi:10.1109/HPCA.2003.1183528

  3. Bacha A, Teodorescu R (2013) Dynamic reduction of voltage margins by leveraging on-chip ECC in Itanium II processors. In: Proceedings of the 40th annual international symposium on computer architecture, ISCA ’13. ACM, New York, pp 297–307. doi:10.1145/2485922.2485948

  4. Baniasadi A, Moshovos A (2001) Instruction flow-based front-end throttling for power-aware high-performance processors. In: Proceedings of the international symposium on low power electronics and design, pp 16–21. doi:10.1109/LPE.2001.945365

  5. Brooks DM, Bose P, Schuster SE, Jacobson H, Kudva PN, Buyuktosunoglu A, Wellman JD, Zyuban V, Gupta M, Cook PW (2000) Power-aware microarchitecture: design and modeling challenges for next-generation microprocessors. IEEE Micro 20:26–44. doi:10.1109/40.888701. http://portal.acm.org/citation.cfm?id=623296.624401

  6. Casmira J, Grunwald D (2000) Dynamic instruction scheduling slack. In: Proceedings of the KoolChips workshop

  7. Cebrian JM, Aragon JL, Garcia JM, Petoumenos P, Kaxiras S (2009) Efficient microarchitecture policies for accurately adapting to power constraints. In: Proceedings of the IEEE international parallel & distributed processing symposium, pp 1–12. doi:10.1109/IPDPS.2009.5161022

  8. Donald J, Martonosi M (2006) Techniques for multicore thermal management: classification and new exploration. In: Proceedings of the 33rd international symposium on computer architecture, pp 78–88. doi:10.1109/ISCA.2006.39

  9. Esmaeilzadeh H, Blem E, St Amant R, Sankaralingam K, Burger D (2011) Dark silicon and the end of multicore scaling. In: Proceedings of the 38th annual international symposium on computer architecture, ISCA ’11. ACM, New York, pp 365–376. doi:10.1145/2000064.2000108

  10. Homayoun H, Kontorinis V, Shayan A, Lin TW, Tullsen D (2012) Dynamically heterogeneous cores through 3D resource pooling. In: 2012 IEEE 18th international symposium on high performance computer architecture (HPCA), pp 1–12. doi:10.1109/HPCA.2012.6169037

  11. Isci C, Buyuktosunoglu A, Cher CY, Bose P, Martonosi M (2006) An analysis of efficient multi-core global power management policies: maximizing performance for a given power budget. In: Proceedings of the 39th annual IEEE/ACM international symposium on microarchitecture, pp 347–358. doi:10.1109/MICRO.2006.8

  12. Kim W, Gupta MS, Wei GY, Brooks D (2008) System level analysis of fast, per-core DVFS using on-chip switching regulators. In: Proceedings of the IEEE 14th international symposium on high performance computer architecture, pp 123–134. doi:10.1109/HPCA.2008.4658633

  13. Kontorinis V, Shayan A, Tullsen DM, Kumar R (2009) Reducing peak power with a table-driven adaptive processor core. In: Proceedings of the 42nd annual IEEE/ACM international symposium on microarchitecture, MICRO 42. ACM, New York, pp 189–200. doi:10.1145/1669112.1669137

  14. Li S, Ahn JH, Strong RD, Brockman JB, Tullsen DM, Jouppi NP (2009) McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures. In: Proceedings of the 42nd international symposium on microarchitecture, pp 469–480

  15. Macken P, Degrauwe M, Van Paemel M, Oguey H (1990) A voltage reduction technique for digital systems. In: Proceedings of the 37th IEEE international solid-state circuits conference. Digest of Technical Papers, pp 238–239. doi:10.1109/ISSCC.1990.110213

  16. Manne S, Klauser A, Grunwald D (1998) Pipeline gating: speculation control for energy reduction. In: Proceedings of the 25th annual international symposium on computer architecture, pp 132–141. doi:10.1109/ISCA.1998.694769

  17. Najeeb K, Konda VVR, Hari SKS, Kamakoti V, Vedula VM (2007) Power virus generation using behavioral models of circuits. In: Proceedings of the 25th IEEE VLSI test symposium, VTS ’07. IEEE Computer Society, Washington, pp 35–42. doi:10.1109/VTS.2007.49

  18. Sartori J, Kumar R (2009) Three scalable approaches to improving many-core throughput for a given peak power budget. In: HiPC’09, pp 89–98

  19. Sartori J, Ahrens B, Kumar R (2012) Power balanced pipelines. In: 2012 IEEE 18th international symposium on high performance computer architecture (HPCA), pp 1–12. doi:10.1109/HPCA.2012.6169032

  20. Sasanka R, Hughes CJ, Adve SV (2002) Joint local and global hardware adaptations for energy. In: Proceedings of the 10th international conference on architectural support for programming languages and operating systems, ASPLOS-X. ACM, New York, pp 144–155. doi:10.1145/605397.605413

  21. Semeraro G, Magklis G, Balasubramonian R, Albonesi DH, Dwarkadas S, Scott ML (2002) Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling. In: Proceedings of the 8th international symposium on high-performance computer architecture, pp 29–40. doi:10.1109/HPCA.2002.995696

  22. Simunic T, Benini L, Acquaviva A, Glynn P, de Micheli G (2001) Dynamic voltage scaling and power management for portable systems. In: Proceedings on design automation conference, pp 524–529. doi:10.1109/DAC.2001.156195

  23. Tune E, Liang D, Tullsen DM, Calder B (2001) Dynamic prediction of critical path instructions. In: Proceedings of the 7th international symposium on high-performance computer architecture, pp 185–195. doi:10.1109/HPCA.2001.903262

  24. Winter JA, Albonesi DH (2008) Addressing thermal nonuniformity in SMT workloads. ACM Trans Archit Code Optim 5:4.1-4.28. doi:10.1145/1369396.1369400

  25. Wu Q, Juang P, Martonosi M, Clark DW (2005) Voltage and frequency control with adaptive reaction time in multiple-clock-domain processors. In: Proceedings of the 11th international symposium on high-performance computer architecture, pp 178–189. doi:10.1109/HPCA.2005.43

Acknowledgments

This work was supported by the Spanish MEC, MICINN and EU Commission FEDER funds under Grants CSD2006-00046 and TIN2009-14475-C04. It was also supported by the EU-FP7 ICT Project “Embedded Reconfigurable Architecture (ERA)”, contract No. 249059. Finally, EU-FP7 HiPEAC funded an internship of J.M. Cebrián at Uppsala University.

Author information

Corresponding author

Correspondence to Juan M. Cebrián.

About this article

Cite this article

Cebrián, J.M., Sánchez, D., Aragón, J.L. et al. Managing power constraints in a single-core scenario through power tokens. J Supercomput 68, 414–442 (2014). https://doi.org/10.1007/s11227-013-1044-2

  • DOI: https://doi.org/10.1007/s11227-013-1044-2