Abstract
Reducing dynamic power consumption is a major goal in modern high-performance processor design. Throttling is a mechanism that reduces dynamic power at the expense of throughput. Instruction profiling can identify a set of instructions suitable for fine-grained throttling without significant performance degradation. This paper develops an Electronic Design Automation (EDA) flow that processes pipeline traces at an early stage to identify bottlenecks. Using this flow, the work identifies a set of instructions suitable for fine-grained CPU throttling to reduce wasted dynamic power in the RISC-V architecture. A weight-based system was introduced to rank the instructions in the instruction profile that cause the most stalls. Independent of workload and type, the same high-stall instructions recurred across all the benchmark programs. The top 10 instruction profiles for each test suite identify the clock cycles in each pipeline stage where throttling is likely to reduce wasted dynamic power at minimal performance loss. These results are expected to help researchers reduce wasted dynamic power by modifying existing architectures and to apply throttling mechanisms effectively without significant performance degradation.
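The weight-based ranking mentioned above can be illustrated with a minimal sketch. The abstract does not define the weighting scheme, so the sketch assumes, purely for illustration, that an instruction's weight is its share of the total stall cycles in the trace. The trace records, the `rank_by_weight` function, and the stage names are hypothetical and are not taken from the paper's EDA flow.

```python
from collections import defaultdict

# Hypothetical pipeline-trace records: (mnemonic, pipeline stage, stall cycles).
# These values are made up for illustration, not measured data.
TRACE = [
    ("ld", "execute", 12),
    ("add", "decode", 1),
    ("ld", "execute", 9),
    ("div", "execute", 20),
    ("add", "decode", 0),
    ("beq", "fetch", 4),
]


def rank_by_weight(trace, top_n=10):
    """Rank instructions by an assumed weight: share of total stall cycles."""
    stalls = defaultdict(int)
    for mnemonic, _stage, cycles in trace:
        stalls[mnemonic] += cycles

    total = sum(stalls.values())
    if total == 0:
        return []  # no stalls recorded, nothing to rank

    weights = {m: c / total for m, c in stalls.items()}
    # Highest weight first; keep only the top_n entries.
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


ranking = rank_by_weight(TRACE)
for mnemonic, weight in ranking:
    print(f"{mnemonic}: {weight:.3f}")
```

Normalizing by the total stall count keeps weights comparable across benchmarks of different lengths, which is one plausible way to check whether the same instructions rank highly in every program.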
Cite this article
Owahid, A.A., John, E.B. Wasted dynamic power and correlation to instruction set architecture for CPU throttling. J Supercomput 75, 2436–2454 (2019). https://doi.org/10.1007/s11227-018-2637-6
DOI: https://doi.org/10.1007/s11227-018-2637-6