research-article

PMU-Events-Driven DVFS Techniques for Improving Energy Efficiency of Modern Processors

Published: 17 August 2022

Abstract

This paper describes the results of our measurement-based study, conducted on an Intel Core i7 processor running the SPEC CPU2017 benchmark suites, that evaluates the impact of dynamic voltage and frequency scaling (DVFS) on performance (P), energy efficiency (EE), and their product (PxEE). The results indicate that the default DVFS-based power management techniques heavily favor performance, resulting in poor energy efficiency. To remedy this problem, we introduce, implement, and evaluate four DVFS-based power management techniques driven by the following metrics derived from the processor's performance monitoring unit (PMU): (i) the total pipeline slot stall ratio (FS-PS), (ii) the total cycle stall ratio (FS-TS), (iii) the total memory-related cycle stall ratio (FS-MS), and (iv) the number of last-level cache misses per kilo-instruction (FS-LLCM). The proposed techniques linearly map these metrics onto the available processor clock frequencies. The experimental evaluation shows that the proposed techniques significantly improve the EE and PxEE metrics compared to the existing approaches. Specifically, the EE improvements range from 44% to 92%, and the PxEE improvements range from 31% to 48%, when all the benchmarks are considered together. Furthermore, we find that the proposed techniques are particularly effective for a class of memory-intensive benchmarks: they improve EE by 121% to 183% and PxEE by 100% to 141%. Finally, we elucidate the advantages and disadvantages of each of the proposed techniques and offer recommendations on using them.
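The linear mapping from a PMU-derived metric onto the available clock frequencies can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the metric bounds, the frequency table, and the direction of the mapping (a higher stall or miss metric selects a lower frequency) are all assumptions made for illustration. The abstract states only that the mapping is linear.

```python
def llc_mpki(llc_misses, instructions):
    """Last-level cache misses per kilo-instruction (the FS-LLCM input metric)."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return llc_misses * 1000.0 / instructions


def map_metric_to_frequency(metric, metric_min, metric_max, freqs_mhz):
    """Linearly map a metric onto a discrete table of clock frequencies.

    Assumption (not stated in the abstract): a larger metric means the core
    is stalled more often, so a lower frequency is selected.
    """
    if metric_max <= metric_min:
        raise ValueError("metric_max must exceed metric_min")
    if not freqs_mhz:
        raise ValueError("frequency table must not be empty")

    freqs = sorted(freqs_mhz)
    # Clamp the metric into [metric_min, metric_max], then normalize to [0, 1].
    clamped = min(max(metric, metric_min), metric_max)
    norm = (clamped - metric_min) / (metric_max - metric_min)
    # norm == 0 selects the highest frequency, norm == 1 the lowest.
    # int(x + 0.5) rounds half up, avoiding Python's round-half-to-even.
    index = int((1.0 - norm) * (len(freqs) - 1) + 0.5)
    return freqs[index]


if __name__ == "__main__":
    # Hypothetical frequency table in MHz, not the steps of the evaluated processor.
    table = [800, 1600, 2400, 3200, 4000]
    for stall_ratio in (0.0, 0.25, 0.5, 1.0):
        print(stall_ratio, map_metric_to_frequency(stall_ratio, 0.0, 1.0, table))
```

A stall ratio is naturally bounded by 0 and 1, whereas a misses-per-kilo-instruction metric has no fixed upper bound, so in this sketch its `metric_max` would have to be chosen empirically.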


Cited By

  • (2024) OS-Level PMC-Based Runtime Thermal Control for ARM Mobile CPUs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:7 (2023-2036). DOI: 10.1109/TCAD.2024.3360319. Online publication date: 31-Jan-2024
  • (2024) Is the powersave governor really saving power? 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (273-283). DOI: 10.1109/CCGrid59990.2024.00039. Online publication date: 6-May-2024
  • (2022) DVFS method of memory hierarchy based on CPU microarchitectural information. 2022 29th IEEE International Conference on Electronics, Circuits and Systems (ICECS) (1-4). DOI: 10.1109/ICECS202256217.2022.9971023. Online publication date: 24-Oct-2022


Published In

ACM Transactions on Modeling and Performance Evaluation of Computing Systems, Volume 7, Issue 1
March 2022
122 pages
ISSN: 2376-3639
EISSN: 2376-3647
DOI: 10.1145/3551657

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 August 2022
Online AM: 25 May 2022
Accepted: 01 May 2022
Revised: 01 May 2022
Received: 01 November 2021
Published in TOMPECS Volume 7, Issue 1


Author Tags

  1. DVFS
  2. performance
  3. energy-efficiency
  4. SPEC CPU2017
  5. power management multicores
  6. benchmarking
  7. performance monitoring unit

Qualifiers

  • Research-article
  • Refereed


