Joint frequency scaling of processor and DRAM

The Journal of Supercomputing

Abstract

Energy efficiency and energy-proportional computing have become a central focus in modern supercomputers. Many previous energy-saving strategies have focused solely on the CPU, while the DRAM subsystem has not been addressed sufficiently, even though memory consumes about 20 % of the total power in a typical server platform. This paper describes a novel runtime system that scales the frequencies of both the processor and DRAM based on performance and power models that are also proposed here. Specifically, a performance-loss constraint is first chosen for an application; then an optimal processor–DRAM frequency pair is modeled such that the pair minimizes the energy consumption in a given timeslice. Experiments performed on the SPEC CPU™ 2006, NAS NPB, and pARMS benchmarks demonstrate that the proposed runtime system may obtain total energy savings for both memory- and compute-intensive applications. In particular, as much as 22 % of energy was saved with a low performance loss of about 4.8 %.
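The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' runtime system: the linear split of a timeslice into processor-bound and DRAM-bound time, the power coefficients, the frequency values, and all names (`TimesliceProfile`, `predicted_time`, `predicted_power`, `select_pair`) are assumptions made only for illustration. The sketch enumerates candidate frequency pairs, discards those whose predicted slowdown exceeds the chosen performance-loss constraint, and keeps the pair with the lowest predicted energy.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TimesliceProfile:
    """Hypothetical split of one timeslice, measured at the highest frequencies."""
    cpu_time: float  # seconds that scale with processor frequency
    mem_time: float  # seconds that scale with DRAM frequency


def predicted_time(profile, f_cpu, f_mem, f_cpu_max, f_mem_max):
    """Placeholder performance model: each part stretches inversely with its frequency."""
    return (profile.cpu_time * f_cpu_max / f_cpu
            + profile.mem_time * f_mem_max / f_mem)


def predicted_power(f_cpu, f_mem):
    """Placeholder power model in watts; the coefficients are invented for illustration."""
    return 40.0 + 4.0 * f_cpu ** 3 + 6.0 * f_mem


def select_pair(profile, cpu_freqs, mem_freqs, max_loss):
    """Return the (f_cpu, f_mem) pair with minimum predicted energy whose
    predicted slowdown stays within max_loss (e.g. 0.05 for 5 %)."""
    f_cpu_max, f_mem_max = max(cpu_freqs), max(mem_freqs)
    baseline = predicted_time(profile, f_cpu_max, f_mem_max, f_cpu_max, f_mem_max)
    best_pair, best_energy = None, float("inf")
    for f_cpu, f_mem in product(cpu_freqs, mem_freqs):
        t = predicted_time(profile, f_cpu, f_mem, f_cpu_max, f_mem_max)
        if t > baseline * (1.0 + max_loss):
            continue  # violates the performance-loss constraint
        energy = predicted_power(f_cpu, f_mem) * t  # joules = watts * seconds
        if energy < best_energy:
            best_pair, best_energy = (f_cpu, f_mem), energy
    return best_pair


if __name__ == "__main__":
    # Assumed frequency levels in GHz; real platforms expose their own sets.
    cpu_freqs = [1.2, 1.8, 2.4]
    mem_freqs = [0.8, 1.066, 1.333]
    memory_bound = TimesliceProfile(cpu_time=0.1, mem_time=0.9)
    print(select_pair(memory_bound, cpu_freqs, mem_freqs, max_loss=0.05))
```

An exhaustive search is used here because the number of available frequency pairs on a real platform is small; a runtime system would repeat the selection every timeslice with freshly measured inputs.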

Notes

  1. TOP500 list: http://top500.org/.

  2. The authors’ previous work [31] outlines the pitfalls of models that rely on a user-defined performance-loss tolerance and introduces a model based on instantaneous power consumption.

  3. LMBench website: http://www.bitmover.com/lmbench/.

  4. See, e.g., http://www.anandtech.com/show/6355/intels-haswell-architecture/8.

  5. Wattsup meter: https://www.wattsupmeters.com.

  6. SPEC CPU™ 2006 benchmarks website: https://www.spec.org/cpu2006/.

References

  1. Begum R, Werner D, Hempstead M, Prasad G, Challen G (2015) Energy-performance trade-offs on energy-constrained devices with multi-component DVFS. In: Workload Characterization (IISWC), 2015 IEEE International Symposium on, pp 34–43, Oct 2015

  2. Borkar S (2001) The exascale challenge, 2011. Keynote speech. In: the 12th International Conference on Parallel Architectures and Compilation Techniques

  3. Chen YJ, Yang CL, Lin PS, Lu YC (2015) Thermal/performance characterization of CMPs with 3D-stacked DRAMs under synergistic voltage-frequency control of cores and DRAMs. In: Proceedings of the 2015 Conference on Research in Adaptive and Convergent Systems, RACS, pp 430–436, New York, NY, USA, 2015. ACM

  4. David H, Fallin C, Gorbatov E, Hanebutte UR, Mutlu O (2011) Memory power management via dynamic voltage/frequency scaling. In: Proceedings of the 8th ACM International Conference on Autonomic Computing, pp 31–40

  5. Deng Q, Meisner D, Bhattacharjee A, Wenisch TF, Bianchini R (2012) CoScale: coordinating CPU and memory system DVFS in server systems. In: Microarchitecture (MICRO), 2012 45th Annual IEEE/ACM International Symposium on, pp 143–154, Dec 2012

  6. Etinski M, Corbalan J, Labarta J, Valero M, Veidenbaum A (2009) Power-aware load balancing of large scale MPI applications. In: Parallel Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on, pp 1–8, May 2009

  7. Freeh VW, Lowenthal DK (2005) Using multiple energy gears in MPI programs on a power-scalable cluster. In: Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, pp 164–173

  8. Ge R, Feng X, Feng W, Cameron KW (2007) CPU MISER: A performance-directed, run-time system for power-aware clusters. In: Parallel Processing, 2007. ICPP 2007. International Conference on, pp 18, Sep. 2007

  9. Ge R, Feng X, Song S, Chang HC, Li D, Cameron KW (2010) PowerPack: energy profiling and analysis of high-performance systems and applications. Parallel Distrib Syst IEEE Trans 21:658–671

  10. Gonzales R, Horowitz M (1995) Energy dissipation in general purpose processors. IEEE J Solid State Circuits 31:1277–1284

  11. Hackenberg D, Schone R, Ilsche T, Molka D, Schuchart J, Geyer R (2015) An energy efficiency feature survey of the Intel Haswell processor. In: Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International, pp 896–904, May 2015

  12. Hennessy JL, Patterson DA (2011) Computer architecture: a quantitative approach (appendix B), 5th edn. Morgan Kaufmann Publishers Inc., San Francisco

  13. Henning JL (2006) SPEC CPU2006 benchmark descriptions. SIGARCH Comput Archit News 34(4):1–17

  14. Hsu CH, Feng W (2005) A power-aware run-time system for high-performance computing. In: Supercomputing, 2005. Proceedings of the ACM/IEEE SC 2005 Conference, pp 1, Nov. 2005

  15. Huang S, Feng W (2009) Energy-efficient cluster computing via accurate workload characterization. In: Cluster Computing and the Grid, 2009. CCGRID’09. 9th IEEE/ACM International Symposium on, pp 68–75, May 2009

  16. Iancu C, Hofmeyr S, Blagojevic F, Zheng Y (2010) Oversubscription on multicore processors. In: Parallel Distributed Processing (IPDPS), 2010 IEEE International Symposium on, pp 1–11

  17. Intel 64 and IA-32 architectures software developer’s manual combined volumes 3A, 3B, and 3C: System programming guide. http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

  18. Ioannou N, Kauschke M, Gries M, Cintra M (2011) Phase-based application-driven hierarchical power management on the single-chip cloud computer. In: Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on, pp 131–142, Oct. 2011

  19. Kandalla K, Mancini EP, Sur S, Panda DK (2010) Designing power-aware collective communication algorithms for InfiniBand clusters. In: Parallel Processing (ICPP), 2010 39th International Conference on, pp 218–227

  20. Lefurgy C, Rajamani K, Rawson F, Felter W, Kistler M, Keller TW (2003) Energy management for commercial servers. Computer 36(12):39–48

  21. Li Z, Saad Y, Sosonkina M (2003) pARMS: a parallel version of the algebraic recursive multilevel solver. Numer Linear Algebra Appl 10:485–509

  22. Lim MY, Freeh VW, Lowenthal DK (2006) Adaptive, transparent frequency and voltage scaling of communication phases in MPI programs. In: Proceedings of the 2006 ACM/IEEE conference on Supercomputing

  23. Mills N, Mills E (2015) Taming the energy use of gaming computers. Energy Efficiency 1–18. doi:10.1007/s12053-015-9371-1

  24. Mittal S (2014) A survey of techniques for improving energy efficiency in embedded computing systems. Int J Comput Aided Eng Technol (IJACET) 6:440–459

  25. Moscibroda T, Mutlu O (2007) Memory performance attacks: Denial of memory service in multi-core systems. In: Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, SS’07, pp 18:1–18:18, Berkeley, CA, USA, 2007. USENIX Association

  26. Park J, Shin D, Chang N, Pedram M (2010) Accurate modeling and calculation of delay and energy overheads of dynamic voltage scaling in modern high-performance microprocessors. In: 2010 International Symposium on Low-Power Electronics and Design (ISLPED), pp 419–424

  27. Rountree B, Lowenthal DK, de Supinski BR, Schulz M, Freeh VW, Bletsch T (2009) Adagio: making DVS practical for complex HPC applications. In: Proceedings of the 23rd international conference on Supercomputing, ICS’09, pp 460–469, New York, NY, USA, 2009. ACM

  28. Saad Y (2003) Iterative methods for sparse linear systems, 2nd edn. SIAM, Philadelphia

  29. Sosonkina M, Saad Y, Cai X (2004) Using the parallel algebraic recursive multilevel solver in modern physical applications. Future Gener Comput Syst 20:489–500

  30. Sundriyal V, Sosonkina M (2011) Per-call energy saving strategies in all-to-all communications. In: Proceedings of the 18th European MPI Users’ Group conference on Recent advances in the message passing interface, EuroMPI’11, pp 188–197, Berlin, Heidelberg, 2011. Springer-Verlag

  31. Sundriyal V, Sosonkina M (2013) Initial investigation of a scheme to use instantaneous CPU power consumption for energy savings format. In: Proceedings of the 1st International Workshop on Energy Efficient Supercomputing, E2SC ’13, pp 1:1–1:6, New York, NY, USA, 2013. ACM

  32. Sundriyal V, Sosonkina M, Gaenko A (2012) Runtime procedure for energy savings in applications with point-to-point communications. In: Computer Architecture and High Performance Computing (SBAC-PAD), 2012 IEEE 24th International Symposium on, pp 155–162

  33. Sundriyal V, Sosonkina M, Zhang Z (2012) Achieving energy efficiency during collective communications. Pract Exp Concurr Comput 25:2140–2156

  34. Tiwari A, Schulz M, Arrington L (2015) Predicting optimal power allocation for CPU and DRAM domains. In: Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015 IEEE International, pp 951–959, May 2015

  35. Vishnu A, Song S, Marquez A, Barker K, Kerbyson D, Cameron K, Balaji P (2010) Designing energy efficient communication runtime systems for data centric programming models. In: Proceedings of the 2010 IEEE/ACM Int’l Conference on Green Computing and Communications & Int’l Conference on Cyber, Physical and Social Computing, GREENCOM-CPSCOM ’10, pp 229–236, Washington, DC, USA, 2010. IEEE Computer Society

  36. Zhang Z, Chang JM (2014) A cool scheduler for multi-core systems exploiting program phases. Comput IEEE Trans 63(5):1061–1073

Acknowledgments

This work was supported in part by the Air Force Office of Scientific Research under the AFOSR award FA9550-12-1-0476, by the National Science Foundation grants 0904782, 1047772, 1516096, by the US Department of Energy, Office of Advanced Scientific Computing Research, through the Ames Laboratory, operated by Iowa State University under contract No. DE-AC02-07CH11358, and by the US Department of Defense High Performance Computing Modernization Program, through a HASI grant.

Author information

Corresponding author

Correspondence to Vaibhav Sundriyal.

Cite this article

Sundriyal, V., Sosonkina, M. Joint frequency scaling of processor and DRAM. J Supercomput 72, 1549–1569 (2016). https://doi.org/10.1007/s11227-016-1680-4

  • DOI: https://doi.org/10.1007/s11227-016-1680-4
