Wasted dynamic power and correlation to instruction set architecture for CPU throttling

The Journal of Supercomputing

Abstract

Reducing dynamic power consumption is one of the major goals in modern high-performance processor design. Throttling is a mechanism that reduces dynamic power at the expense of throughput. Instruction profiling can identify a set of instructions suitable for fine-grained throttling without significant performance degradation. In this paper, an Electronic Design Automation (EDA) flow was developed to process pipeline traces at an early stage and identify bottlenecks. Using the developed EDA flow, this work identifies a set of instructions suitable for fine-grained CPU throttling to reduce wasted dynamic power in the RISC-V architecture. To rank the instructions in the profile that cause the most stalls, a weight-based system was introduced. It was observed that, independent of workload and type, the same high-stall instructions recurred across all the benchmark programs. The top 10 instruction profiles for each test suite identify probable throttling clock cycles for each pipeline stage, so that wasted dynamic power can be reduced at minimal performance loss. These results are expected to enable researchers to reduce wasted dynamic power by modifying existing architectures and to apply throttling mechanisms effectively without significant performance degradation.
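The weight-based ranking described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula or trace format, so everything here is an assumption made for illustration: the `TraceEntry` record, the per-stage `stage_weights` table, and the scoring rule (stall cycles multiplied by a stage weight and summed per mnemonic) are hypothetical and are not the authors' method.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class TraceEntry(NamedTuple):
    """One hypothetical pipeline-trace record (assumed format, not from the paper)."""
    mnemonic: str      # instruction name, e.g. "ld", "div"
    stage: str         # pipeline stage in which the stall was observed
    stall_cycles: int  # number of cycles the instruction stalled in that stage


def rank_stall_instructions(
    trace: Iterable[TraceEntry],
    stage_weights: Dict[str, float],
    top_n: int = 10,
) -> List[Tuple[str, float]]:
    """Return the top_n mnemonics ordered by weighted stall score.

    Assumed scoring rule: score(mnemonic) = sum(weight[stage] * stall_cycles).
    Stages missing from stage_weights default to a weight of 1.0.
    """
    scores: Dict[str, float] = defaultdict(float)
    for entry in trace:
        weight = stage_weights.get(entry.stage, 1.0)
        scores[entry.mnemonic] += weight * entry.stall_cycles
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


if __name__ == "__main__":
    # Toy trace with made-up numbers, purely to show the call shape.
    trace = [
        TraceEntry("ld", "execute", 4),
        TraceEntry("div", "execute", 10),
        TraceEntry("ld", "issue", 2),
        TraceEntry("add", "execute", 0),
    ]
    weights = {"issue": 1.0, "execute": 2.0}
    for mnemonic, score in rank_stall_instructions(trace, weights):
        print(f"{mnemonic}: {score}")
```

Running the same ranking over the traces of several benchmark programs and comparing the resulting top-10 lists is one way to check whether the same instructions recur across workloads, which is the observation the abstract reports.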



Author information

Corresponding author

Correspondence to Abdullah A. Owahid.

About this article

Cite this article

Owahid, A.A., John, E.B. Wasted dynamic power and correlation to instruction set architecture for CPU throttling. J Supercomput 75, 2436–2454 (2019). https://doi.org/10.1007/s11227-018-2637-6


  • DOI: https://doi.org/10.1007/s11227-018-2637-6
