Predictive Modeling for Job Power Consumption in HPC Systems

  • Conference paper
High Performance Computing (ISC High Performance 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9697)

Abstract

Power consumption is a critical aspect for next-generation High Performance Computing systems: supercomputers are expected to reach Exascale in 2023, but this will require a significant improvement in energy efficiency. In this domain, power capping can significantly increase the final energy efficiency by cutting cooling effort and worst-case design margins. A key aspect of an optimal implementation of power capping is the ability to estimate the power consumption of HPC applications before they run on the real system. In this paper we propose a Machine Learning approach, based on the user and application resource request, to accurately predict the power consumption of typical supercomputer workloads. We demonstrate our method on real production workloads executed on the Eurora supercomputer hosted at the CINECA computing center in Bologna, and we provide useful insights for applying our technique in other installations.

Notes

  1. Measured as FLOPS (floating-point operations per second).

  2. This is probably because Eurora was originally a prototype and only later entered its production phase.

  3. The proactive job dispatchers mentioned above need to know the job power consumption in advance as a single value; they could in theory manage power as a more complex object (i.e. a curve instead of a single value), but this would risk significant performance losses.

  4. The number of requested HW accelerators is important because GPUs and Xeon Phi are mounted on computing nodes with different power consumptions, i.e. a job requiring a GPU will necessarily run on a CPU consuming more power than those paired with a Xeon Phi.

  5. We actually used a normalized prediction error: \((real\_power - predicted\_power) / real\_power\).

  6. This is also a problem for new users, for whom we cannot build any prediction model until they have submitted a sufficient number of jobs.

  7. We cannot simply delete all jobs with short durations from the training set, since doing so could discard legitimate jobs.

  8. In Eurora’s case this corresponds to less than 3 months of observation.
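
The normalized prediction error defined in note 5 can be illustrated with a minimal sketch. The function name, the job values, and the mean-absolute summary below are assumptions made for illustration only; they are not data or code from the paper.

```python
def normalized_error(real_power: float, predicted_power: float) -> float:
    """Signed normalized prediction error: (real - predicted) / real."""
    if real_power == 0:
        raise ValueError("real_power must be non-zero")
    return (real_power - predicted_power) / real_power


# Hypothetical (measured, predicted) job power pairs in watts.
# These numbers are illustrative, not results from the paper.
jobs = [(200.0, 180.0), (150.0, 165.0), (80.0, 80.0)]

# One signed error per job: positive means the model under-predicted.
errors = [normalized_error(real, pred) for real, pred in jobs]

# Assumed summary statistic: mean of the absolute normalized errors.
mean_abs_error = sum(abs(e) for e in errors) / len(errors)
```

Dividing by the measured power makes errors comparable across jobs of very different sizes; keeping the sign shows whether a prediction was too low (positive) or too high (negative).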

Acknowledgement

This work was partially supported by the FP7 ERC Advance project MULTITHERMAN (g.a. 291125). We also want to thank CINECA and Eurotech for granting us access to their systems.

Author information

Corresponding author

Correspondence to Andrea Borghesi.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Borghesi, A., Bartolini, A., Lombardi, M., Milano, M., Benini, L. (2016). Predictive Modeling for Job Power Consumption in HPC Systems. In: Kunkel, J., Balaji, P., Dongarra, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, vol 9697. Springer, Cham. https://doi.org/10.1007/978-3-319-41321-1_10

  • DOI: https://doi.org/10.1007/978-3-319-41321-1_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41320-4

  • Online ISBN: 978-3-319-41321-1

  • eBook Packages: Computer Science, Computer Science (R0)
