Pre-execution power consumption prediction of computational multithreaded workloads

Cluster Computing

Abstract

Power management in large-scale computational environments can benefit significantly from predictive models that describe a workload's power consumption behavior before it runs. Power consumption depends on the characteristics of both the machine and the workload. However, combinational features such as the cache miss rate, which depend on the machine and the workload together, cannot be used because they are unavailable before the workload runs. Pre-execution power modeling therefore requires both machine-independent workload characteristics and workload-independent machine characteristics. This paper tackles the predictive modeling problem with a two-stage modeling framework. In the first stage, a machine learning approach predicts the power consumption of a single-threaded workload at a specific frequency. The second stage analytically scales this output to any intended thread/frequency configuration. Experimental results show that the proposed approach yields highly accurate predictions of workload power consumption, with an average error of 3.7% on six different test platforms.
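The two-stage structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not give the learned model or the scaling formulas, so everything below is an assumption made for illustration. Stage 1 is stood in for by a plain linear model over hypothetical machine-independent features, and stage 2 assumes that dynamic power grows with the cube of frequency (voltage scaled with frequency) and linearly with the number of busy threads, weighted by an assumed parallel fraction. The names `Machine`, `stage1_predict`, and `stage2_scale` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Workload-independent machine characteristics (all hypothetical)."""
    idle_power_w: float   # power drawn with no workload running
    base_freq_ghz: float  # frequency at which stage 1 predicts


def stage1_predict(features, weights, bias):
    """Stage 1 stand-in: a linear model mapping machine-independent
    workload features to single-threaded power at the base frequency.
    A real learned model would replace this function."""
    return bias + sum(w * x for w, x in zip(weights, features))


def stage2_scale(p_single_w, machine, freq_ghz, threads, parallel_fraction):
    """Stage 2 stand-in: analytically scale the stage-1 output to a
    thread/frequency configuration under the stated assumptions."""
    # Split the prediction into an idle part and a dynamic part.
    dynamic = p_single_w - machine.idle_power_w
    # Assumption: dynamic power ~ V^2 * f with V proportional to f.
    freq_factor = (freq_ghz / machine.base_freq_ghz) ** 3
    # Assumption: the serial part keeps one thread busy and the
    # parallel part keeps all of them busy.
    thread_factor = (1 - parallel_fraction) + parallel_fraction * threads
    return machine.idle_power_w + dynamic * freq_factor * thread_factor


if __name__ == "__main__":
    machine = Machine(idle_power_w=50.0, base_freq_ghz=1.0)
    p1 = stage1_predict([1.0, 2.0], weights=[10.0, 5.0], bias=50.0)
    print(stage2_scale(p1, machine, freq_ghz=2.0, threads=4,
                       parallel_fraction=0.5))
```

Keeping the two stages as separate functions mirrors the separation the abstract describes: only the first stage needs training data, while the second uses machine parameters alone, so a single prediction can be reused across configurations.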

Author information

Corresponding author

Correspondence to Hamid Fadishei.

About this article

Cite this article

Fadishei, H., Deldari, H. & Naghibzadeh, M. Pre-execution power consumption prediction of computational multithreaded workloads. Cluster Comput 17, 1323–1333 (2014). https://doi.org/10.1007/s10586-014-0401-0

  • DOI: https://doi.org/10.1007/s10586-014-0401-0
