Abstract
Power consumption is a critical aspect of next-generation High Performance Computing systems: supercomputers are expected to reach Exascale in 2023, but this will require a significant improvement in energy efficiency. In this domain, power capping can significantly increase the final energy efficiency by cutting cooling effort and worst-case design margins. A key requirement for an optimal implementation of power capping is the ability to estimate the power consumption of HPC applications before they run on the real system. In this paper we propose a Machine Learning approach, based on the user and the application's resource request, to accurately predict the power consumption of typical supercomputer workloads. We demonstrate our method on real production workloads executed on the Eurora supercomputer, hosted at the CINECA computing center in Bologna, and we provide useful insights for applying our technique in other installations.
Notes
- 1. Measured as FLOPS (floating-point operations per second).
- 2. This is probably because Eurora was originally a prototype and only later entered its production phase.
- 3. The aforementioned proactive job dispatchers need to know each job's power consumption in advance as a single value; they could in theory manage power as a more complex object (i.e. a curve instead of a single value), but at the risk of significant performance losses.
- 4. The number of requested HW accelerators is important because GPUs and Xeon Phi are mounted on computing nodes with different power consumptions, i.e. a job requiring a GPU will necessarily run on a CPU consuming more power than the CPUs on nodes with a Xeon Phi.
- 5. We actually used a normalized prediction error: \((real\_power - predicted\_power) / real\_power\).
- 6. This is also a problem for new users, for whom we cannot build any prediction model until they have submitted a sufficient number of jobs.
- 7. We cannot simply delete all short jobs from the training set, since doing so could discard legitimate jobs.
- 8. In Eurora’s case this corresponds to less than 3 months of observation.
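
The normalized prediction error defined in note 5 can be illustrated with a minimal sketch. The function names, the sample power values, and the choice of averaging absolute errors over jobs are assumptions made for illustration only; they are not taken from the paper.

```python
from statistics import mean


def normalized_error(real_power: float, predicted_power: float) -> float:
    """Signed normalized prediction error: (real - predicted) / real.

    A positive value means the model under-estimated the job's power,
    a negative value means it over-estimated it.
    """
    if real_power == 0:
        raise ValueError("real_power must be non-zero")
    return (real_power - predicted_power) / real_power


def mean_absolute_normalized_error(pairs) -> float:
    """Average of |normalized error| over (real, predicted) pairs.

    Aggregating with a mean of absolute values is an assumption of this
    sketch; the source only defines the per-job error.
    """
    return mean(abs(normalized_error(real, pred)) for real, pred in pairs)


# Hypothetical (real_power, predicted_power) pairs in watts.
jobs = [(200.0, 180.0), (150.0, 165.0), (400.0, 400.0)]

errors = [normalized_error(real, pred) for real, pred in jobs]
summary = mean_absolute_normalized_error(jobs)
```

Because the error is divided by the real power, jobs with very different power draws contribute on the same relative scale, so a 20 W miss on a 200 W job counts the same as a 40 W miss on a 400 W job.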
Acknowledgement
This work was partially supported by the FP7 ERC Advance project MULTITHERMAN (g.a. 291125). We also thank CINECA and Eurotech for granting us access to their systems.
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Borghesi, A., Bartolini, A., Lombardi, M., Milano, M., Benini, L. (2016). Predictive Modeling for Job Power Consumption in HPC Systems. In: Kunkel, J., Balaji, P., Dongarra, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, vol 9697. Springer, Cham. https://doi.org/10.1007/978-3-319-41321-1_10
DOI: https://doi.org/10.1007/978-3-319-41321-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41320-4
Online ISBN: 978-3-319-41321-1
eBook Packages: Computer Science (R0)