Prediction models for performance, power, and energy efficiency of software executed on heterogeneous hardware

The Journal of Supercomputing

Abstract

Heterogeneous computing environments are becoming commonplace, so it is increasingly important to understand how and where a given algorithm can be executed most efficiently. In this paper, we propose a methodology that uses both static source code metrics and dynamic execution time, power, and energy measurements to build gain ratio prediction models. These models are trained on special benchmarks that have both sequential and parallel implementations and can be executed on various computing elements, e.g., CPUs, GPUs, or FPGAs. Once built, however, they can be applied to a new system using only that system’s static source code metrics, which are much easier to compute than any dynamic measurement. We found that, while estimating a continuous gain ratio is a much harder problem, we could predict the gain category (e.g., “slight improvement” or “large deterioration”) of porting to a specific configuration significantly more accurately than a random choice, using static information alone. We also conclude, based on our benchmarks, that parallelized implementations are less maintainable, which supports the need for automatic transformations.
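To make the notions of gain ratio and gain category concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a gain ratio is the baseline (sequential) measurement divided by the ported (parallel) measurement, so that values above 1 mean an improvement for lower-is-better quantities such as time, power, and energy. The names `Measurement`, `gain_ratio`, and `gain_category`, the thresholds, and every category label other than the two quoted in the abstract are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One dynamic measurement of a benchmark run (hypothetical container)."""
    time_s: float   # execution time in seconds
    power_w: float  # average power draw in watts

    @property
    def energy_j(self) -> float:
        # Energy is average power multiplied by elapsed time.
        return self.time_s * self.power_w


def gain_ratio(baseline: float, ported: float) -> float:
    """Assumed definition: baseline / ported, so >1 means the port is better."""
    if baseline <= 0 or ported <= 0:
        raise ValueError("measurements must be positive")
    return baseline / ported


def gain_category(ratio: float, slight: float = 1.1, large: float = 2.0) -> str:
    """Bin a continuous gain ratio into a coarse category.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    if ratio >= large:
        return "large improvement"
    if ratio >= slight:
        return "slight improvement"
    if ratio > 1 / slight:
        return "no change"
    if ratio > 1 / large:
        return "slight deterioration"
    return "large deterioration"


# Made-up example numbers: a sequential CPU run versus a GPU port.
sequential = Measurement(time_s=10.0, power_w=50.0)
gpu_port = Measurement(time_s=2.0, power_w=80.0)

time_gain = gain_ratio(sequential.time_s, gpu_port.time_s)
power_gain = gain_ratio(sequential.power_w, gpu_port.power_w)
energy_gain = gain_ratio(sequential.energy_j, gpu_port.energy_j)

for name, g in [("time", time_gain), ("power", power_gain), ("energy", energy_gain)]:
    print(f"{name}: ratio={g:.3f} -> {gain_category(g)}")
```

In this sketch the category labels, rather than the continuous ratios, would serve as the classification target for a model trained on static source code metrics.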

Notes

  1. This journal paper is an extended version of our earlier conference paper [7].

  2. The full tables are part of the Online Appendix [8].

References

  1. (2014) NVIDIA Management Library (NVML)—Reference Manual. NVIDIA Corporation, TRM-06719-001_vR331

  2. (2014) PicoScope 4000 Series (A API)—Programmers Guide. Pico Technology Ltd., ps4000apg.en r1

  3. (2015) AMD GPU Performance API—User Guide. Advanced Micro Devices, Inc., v2.15

  4. (2015) ARM DS-5 Version 5.21—Streamline User Guide. ARM, ARM DUI0482S

  5. (2015) Intel 64 and IA-32 Architectures Software Developer’s Manual: vol 3B. Intel Corporation, Order Number 253669

  6. Arlot S, Celisse A (2010) A survey of cross-validation procedures for model selection. Stat Surv 4:40–79

  7. Bán D, Ferenc R, Siket I, Kiss Á (2015) Prediction models for performance, power, and energy efficiency of software executed on heterogeneous hardware. In: Proceedings of the 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA 2015). IEEE, pp 178–183

  8. Bán D, Ferenc R, Siket I, Kiss Á, Gyimóthy T (2017) Performance, power, and energy prediction models. http://www.inf.u-szeged.hu/~ferenc/papers/PerformancePowerEnergyModels/

  9. Bán D, Sipka R, Dobi I (2017) Tagged parallel benchmarks. https://github.com/sed-inf-u-szeged/TaggedParallelBenchmarks

  10. Brandolese C, Fornaciari W, Salice F, Sciuto D (2001) Source-level execution time estimation of C programs. In: Proceedings of the Ninth International Symposium on Hardware/Software Codesign (CODES). ACM, New York, NY, USA, pp 98–103

  11. Brown KJ, Sujeeth AK, Lee HJ, Rompf T, Chafi H, Odersky M, Olukotun K (2011) A heterogeneous parallel framework for domain-specific languages. In: Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques. IEEE Computer Society, pp 89–100

  12. Che S, Boyer M, Meng J, Tarjan D, Sheaffer J, Lee SH, Skadron K (2009) Rodinia: a benchmark suite for heterogeneous computing. In: IEEE International Symposium on Workload Characterization (IISWC). IEEE Computer Society, Washington, DC, USA, pp 44–54

  13. Ferenc R et al (2014) Static analysis techniques for AIR generation. Deliverable D2.2, REPARA

  14. Ferenc R et al (2015) Maintainability models of heterogeneous programming models. Deliverable D7.4, REPARA

  15. Fursin G, Kashnikov Y, Memon AW, Chamski Z, Temam O, Namolaru M, Yom-Tov E, Mendelson B, Zaks A, Courtois E, Bodin F, Barnard P, Ashton E, Bonilla E, Thomson J, Williams CKI, O’Boyle M (2011) Milepost GCC: machine learning enabled self-tuning compiler. Int J Parallel Program 39:296–327

  16. Grauer-Gray S, Xu L, Searles R, Ayalasomayajula S, Cavazos J (2012) Auto-tuning a high-level language targeted to GPU codes. In: Innovative Parallel Computing (InPar). IEEE, pp 1–10

  17. Grewe D, O’Boyle MFP (2011) A static task partitioning approach for heterogeneous systems using OpenCL. In: Proceedings of the 20th International Conference Compiler Construction (CC). Springer, Berlin, Heidelberg, pp 286–305

  18. Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. In: SIGKDD Explorations, ACM, vol 11, pp 10–18

  19. Kiss Á, Molnár P, Sipka R (2017) RMeasure performance and energy monitoring library. https://github.com/sed-inf-u-szeged/RMeasure

  20. Kuperberg M, Krogmann K, Reussner R (2008) Performance prediction for black-box components using reengineered parametric behaviour models. In: Proceedings of the 11th International Symposium on Component-Based Software Engineering. Springer, pp 48–63

  21. Li D, de Supinski B, Schulz M, Cameron K, Nikolopoulos D (2010) Hybrid MPI/OpenMP power-aware computing. In: IEEE International Symposium on Parallel Distributed Processing (IPDPS). IEEE, pp 1–12

  22. Ma X, Dong M, Zhong L, Deng Z (2009) Statistical power consumption analysis and modeling for GPU-based computing. In: Proceedings of the SOSP Workshop on Power-Aware Computing and Systems (HotPower ’09)

  23. Marin G, Mellor-Crummey J (2004) Cross-architecture performance predictions for scientific applications using parameterized models. In: Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems. ACM, pp 2–13

  24. Osmulski T, Muehring JT, Veale B, West JM, Li H, Vanichayobon S, Ko SH, Antonio JK, Dhall SK (2000) A probabilistic power prediction tool for the Xilinx 4000-series FPGA. In: Proceedings of the IPDPS 2000 Workshops on Parallel and Distributed Processing. Springer, pp 776–783

  25. Pflüger D, Pfander D (2016) Computational efficiency vs. maintainability and portability. Experiences with the sparse grid code sg++. In: Proceedings of the Fourth International Workshop on Software Engineering for High Performance Computing in Computational Science and Engineering (SE-HPCCSE). IEEE, pp 17–25

  26. Pouchet LN (2011) Polybench: the polyhedral benchmark suite. http://www-roc.inria.fr/~pouchet/software/polybench

  27. Sánchez LM et al (2014) Target platform description specification. Deliverable D3.1, REPARA

  28. Shen J, Fang J, Sips H, Varbanescu A (2012) Performance gaps between OpenMP and OpenCL for multi-core CPUs. In: 41st International Conference on Parallel Processing Workshops (ICPPW). IEEE Computer Society, Washington, DC, USA, pp 116–125

  29. Stratton JA, Rodrigues C, Sung IJ, Obeid N, Chang LW, Anssari N, Liu GD, Mei W, Hwu W (2012) Parboil: a revised benchmark suite for scientific and commercial throughput computing. Technical report, University of Illinois at Urbana-Champaign

  30. Takizawa H, Sato K, Kobayashi H (2008) SPRAT: Runtime processor selection for energy-aware computing. In: IEEE International Conference on Cluster Computing. IEEE, pp 386–393

  31. Van Der Vaart A (1998) Asymptotic statistics, Cambridge series in statistical and probabilistic mathematics, vol 3. Cambridge University Press, Cambridge

  32. Yang L, Ma X, Mueller F (2005) Cross-platform performance prediction of parallel applications using partial execution. In: Proceedings of the ACM/IEEE SC 2005 Conference on Supercomputing. IEEE Computer Society, Washington, DC, USA, p 40

Acknowledgements

The authors would like to thank Péter Molnár and Róbert Sipka for their extensive help with dynamic measurements. This work was supported by the European Union FP7 Project “REPARA—Reengineering and Enabling Performance And poweR of Applications” (Project No. 609666), and by the EU-funded Hungarian national Grant GINOP-2.3.2-15-2016-00037 titled “Internet of Living Things.”

Author information

Corresponding author

Correspondence to István Siket.

About this article

Cite this article

Bán, D., Ferenc, R., Siket, I. et al. Prediction models for performance, power, and energy efficiency of software executed on heterogeneous hardware. J Supercomput 75, 4001–4025 (2019). https://doi.org/10.1007/s11227-018-2252-6

  • DOI: https://doi.org/10.1007/s11227-018-2252-6
