Prediction models for performance, power, and energy efficiency of software executed on heterogeneous hardware

Bán, Dénes; Ferenc, Rudolf; Siket, István; Kiss, Ákos; Gyimóthy, Tibor

doi:10.1007/s11227-018-2252-6

Published: 02 February 2018

Volume 75, pages 4001–4025, (2019)
The Journal of Supercomputing

Abstract

Heterogeneous computing environments are becoming commonplace, so it is increasingly important to understand how and where a given algorithm can be executed most efficiently. In this paper, we propose a methodology that uses both static source code metrics and dynamic execution time, power, and energy measurements to build gain ratio prediction models. These models are trained on special benchmarks that have both sequential and parallel implementations and can be executed on various computing elements, e.g., CPUs, GPUs, or FPGAs. Once built, however, they can be applied to a new system using only that system’s static source code metrics, which are much easier to compute than any dynamic measurement. We found that, while estimating a continuous gain ratio is a much harder problem, we could predict the gain category (e.g., “slight improvement” or “large deterioration”) of porting to a specific configuration significantly more accurately than a random choice, using static information alone. We also conclude, based on our benchmarks, that parallelized implementations are less maintainable, thereby supporting the need for automatic transformations.
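
To make the distinction between a continuous gain ratio and a gain category concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the gain ratio is the baseline (e.g., sequential CPU) measurement divided by the ported measurement, so that values above 1 mean the port is better. The bin boundaries in `BOUNDS` and all labels other than “slight improvement” and “large deterioration” are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

# Hypothetical bin boundaries on the gain ratio (assumption, not from the paper).
BOUNDS = [0.5, 0.9, 1.1, 2.0]

# Only "slight improvement" and "large deterioration" appear in the abstract;
# the remaining labels are placeholders chosen for illustration.
LABELS = [
    "large deterioration",
    "slight deterioration",
    "no change",
    "slight improvement",
    "large improvement",
]


def gain_ratio(baseline: float, ported: float) -> float:
    """Assumed definition: baseline cost / ported cost (time, power, or energy)."""
    if baseline <= 0 or ported <= 0:
        raise ValueError("measurements must be positive")
    return baseline / ported


def gain_category(ratio: float) -> str:
    """Map a continuous gain ratio to one of the discrete categories."""
    return LABELS[bisect_right(BOUNDS, ratio)]


if __name__ == "__main__":
    # Made-up measurements in seconds, purely for demonstration.
    ratio = gain_ratio(baseline=10.0, ported=8.0)
    print(ratio, gain_category(ratio))
```

Under these assumptions, a classifier trained on static metrics would predict the output of `gain_category` rather than the raw ratio, which is the easier of the two problems described above.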

Notes

This journal paper is an extended version of our earlier conference paper [7].
The full tables are part of the Online Appendix [8].

References

(2014) NVIDIA Management Library (NVML)—Reference Manual. NVIDIA Corporation, TRM-06719-001 _vR331
(2014) PicoScope 4000 Series (A API)—Programmers Guide. Pico Technology Ltd., ps4000apg.en r1
(2015) AMD GPU Performance API—User Guide. Advanced Micro Devices, Inc., v2.15
(2015) ARM DS-5 Version 5.21—Streamline User Guide. ARM, ARM DUI0482S
(2015) Intel 64 and IA-32 Architectures Software Developer’s Manual: vol 3B. Intel Corporation, Order Number 253669
Arlot S, Celisse A (2010) A survey of cross-validation procedures for model selection. Stat Surv 4:40–79
Bán D, Ferenc R, Siket I, Kiss Á (2015) Prediction models for performance, power, and energy efficiency of software executed on heterogeneous hardware. In: Proceedings of the 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA 2015). IEEE, pp 178–183
Bán D, Ferenc R, Siket I, Kiss Á, Gyimóthy T (2017) Performance, power, and energy prediction models. http://www.inf.u-szeged.hu/~ferenc/papers/PerformancePowerEnergyModels/
Bán D, Sipka R, Dobi I (2017) Tagged parallel benchmarks. https://github.com/sed-inf-u-szeged/TaggedParallelBenchmarks
Brandolese C, Fornaciari W, Salice F, Sciuto D (2001) Source-level execution time estimation of C programs. In: Proceedings of the Ninth International Symposium on Hardware/Software Codesign (CODES). ACM, New York, NY, USA, pp 98–103
Brown KJ, Sujeeth AK, Lee HJ, Rompf T, Chafi H, Odersky M, Olukotun K (2011) A heterogeneous parallel framework for domain-specific languages. In: Proceedings of the 2011 International Conference on Parallel Architectures and Compilation Techniques. IEEE Computer Society, pp 89–100
Che S, Boyer M, Meng J, Tarjan D, Sheaffer J, Lee SH, Skadron K (2009) Rodinia: a benchmark suite for heterogeneous computing. In: IEEE International Symposium on Workload Characterization (IISWC). IEEE Computer Society, Washington, DC, USA, pp 44–54
Ferenc R et al (2014) Static analysis techniques for AIR generation. Deliverable D2.2, REPARA
Ferenc R et al (2015) Maintainability models of heterogeneous programming models. Deliverable D7.4, REPARA
Fursin G, Kashnikov Y, Memon AW, Chamski Z, Temam O, Namolaru M, Yom-Tov E, Mendelson B, Zaks A, Courtois E, Bodin F, Barnard P, Ashton E, Bonilla E, Thomson J, Williams CKI, O’Boyle M (2011) Milepost GCC: machine learning enabled self-tuning compiler. Int J Parallel Program 39:296–327
Grauer-Gray S, Xu L, Searles R, Ayalasomayajula S, Cavazos J (2012) Auto-tuning a high-level language targeted to GPU codes. In: Innovative Parallel Computing (InPar). IEEE, pp 1–10
Grewe D, O’Boyle MFP (2011) A static task partitioning approach for heterogeneous systems using OpenCL. In: Proceedings of the 20th International Conference Compiler Construction (CC). Springer, Berlin, Heidelberg, pp 286–305
Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The WEKA data mining software: an update. In: SIGKDD Explorations, ACM, vol 11, pp 10–18
Kiss Á, Molnár P, Sipka R (2017) RMeasure performance and energy monitoring library. https://github.com/sed-inf-u-szeged/RMeasure
Kuperberg M, Krogmann K, Reussner R (2008) Performance prediction for black-box components using reengineered parametric behaviour models. In: Proceedings of the 11th International Symposium on Component-Based Software Engineering. Springer, pp 48–63
Li D, de Supinski B, Schulz M, Cameron K, Nikolopoulos D (2010) Hybrid MPI/OpenMP power-aware computing. In: IEEE International Symposium on Parallel Distributed Processing (IPDPS). IEEE, pp 1–12
Ma X, Dong M, Zhong L, Deng Z (2009) Statistical power consumption analysis and modeling for GPU-based computing. In: Proceedings of SOSP Workshop on Power-aware Computing and Systems (HotPower)’09
Marin G, Mellor-Crummey J (2004) Cross-architecture performance predictions for scientific applications using parameterized models. In: Proceedings of the Joint International Conference on Measurement and Modeling of Computer Systems. ACM, pp 2–13
Osmulski T, Muehring JT, Veale B, West JM, Li H, Vanichayobon S, Ko SH, Antonio JK, Dhall SK (2000) A probabilistic power prediction tool for the Xilinx 4000-series FPGA. In: Proceedings of the IPDPS 2000 Workshops on Parallel and Distributed Processing. Springer, pp 776–783
Pflüger D, Pfander D (2016) Computational efficiency vs. maintainability and portability. Experiences with the sparse grid code sg++. In: Proceedings of the Fourth International Workshop on Software Engineering for High Performance Computing in Computational Science and Engineering (SE-HPCCSE). IEEE, pp 17–25
Pouchet LN (2011) Polybench: the polyhedral benchmark suite. http://www-roc.inria.fr/~pouchet/software/polybench
Sánchez LM et al (2014) Target platform description specification. Deliverable D3.1, REPARA
Shen J, Fang J, Sips H, Varbanescu A (2012) Performance gaps between OpenMP and OpenCL for multi-core CPUs. In: 41st International Conference on Parallel Processing Workshops (ICPPW). IEEE Computer Society, Washington, DC, USA, pp 116–125
Stratton JA, Rodrigues C, Sung IJ, Obeid N, Chang LW, Anssari N, Liu GD, Mei W, Hwu W (2012) Parboil: a revised benchmark suite for scientific and commercial throughput computing. Technical report, University of Illinois at Urbana-Champaign
Takizawa H, Sato K, Kobayashi H (2008) SPRAT: Runtime processor selection for energy-aware computing. In: IEEE International Conference on Cluster Computing. IEEE, pp 386–393
Van Der Vaart A (1998) Asymptotic statistics, Cambridge series in statistical and probabilistic mathematics, vol 3. Cambridge University Press, Cambridge
Yang L, Ma X, Mueller F (2005) Cross-platform performance prediction of parallel applications using partial execution. In: Proceedings of the ACM/IEEE SC 2005 Conference on Supercomputing. IEEE Computer Society, Washington, DC, USA, p 40

Acknowledgements

The authors would like to thank Péter Molnár and Róbert Sipka for their extensive help with dynamic measurements. This work was supported by the European Union FP7 Project “REPARA—Reengineering and Enabling Performance And poweR of Applications” (Project No. 609666), and by the EU-funded Hungarian national Grant GINOP-2.3.2-15-2016-00037 titled “Internet of Living Things.”

Author information

Authors and Affiliations

Department of Software Engineering, University of Szeged, Szeged, Hungary
Dénes Bán, Rudolf Ferenc, István Siket & Ákos Kiss
MTA-SZTE Research Group on Artificial Intelligence, Department of Software Engineering, University of Szeged, Szeged, Hungary
Tibor Gyimóthy

Corresponding author

Correspondence to István Siket.

About this article

Cite this article

Bán, D., Ferenc, R., Siket, I. et al. Prediction models for performance, power, and energy efficiency of software executed on heterogeneous hardware. J Supercomput 75, 4001–4025 (2019). https://doi.org/10.1007/s11227-018-2252-6

Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s11227-018-2252-6
