DOI: 10.1145/2807591.2807637
research-article

Finding the limits of power-constrained application performance

Published: 15 November 2015

Abstract

As we approach exascale systems, power is turning from an optimization goal to a critical operating constraint. With power bounds imposed by both stakeholders and the limitations of existing infrastructure, we need to develop new techniques that work with limited power to extract maximum performance. In this paper, we explore this area and provide an approach to find the theoretical upper bound of computational performance on a per-application basis in hybrid MPI + OpenMP applications.
We use a linear programming (LP) formulation to optimize application schedules under various power constraints, where a schedule consists of a DVFS state and number of OpenMP threads for each section of computation between consecutive MPI calls. We also provide a more flexible mixed integer-linear (ILP) formulation and show that the resulting schedules closely match schedules from the LP formulation. Across four applications, we use our LP-derived upper bounds to show that current approaches trail optimal, power-constrained performance by up to 41.1%. This demonstrates the untapped potential of current systems, and our LP formulation provides future optimization approaches with a quantitative optimization target.
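To make the notion of a schedule concrete, here is a minimal sketch of the selection problem for a single step: two MPI ranks each run one computation section before the next MPI call, and each rank picks a (DVFS frequency, OpenMP thread count) configuration so that the combined power stays under a cap. It uses exhaustive search rather than the paper's LP or ILP formulation, and the rank names, timings, and power figures are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical measurements, not from the paper:
# (freq_ghz, omp_threads) -> (seconds, watts) for the computation section
# each rank runs between two consecutive MPI calls.
PROFILES = {
    "rank0": {(1.2, 4): (10.0, 40.0), (1.2, 8): (6.0, 55.0),
              (2.4, 4): (6.5, 60.0), (2.4, 8): (4.0, 90.0)},
    "rank1": {(1.2, 4): (5.0, 40.0), (1.2, 8): (3.0, 55.0),
              (2.4, 4): (3.2, 60.0), (2.4, 8): (2.0, 90.0)},
}


def best_schedule(profiles, power_cap_watts):
    """Exhaustively pick one configuration per rank.

    Returns (makespan_seconds, {rank: (freq_ghz, omp_threads)}) for the
    fastest combination whose summed power fits under the cap, or None
    if no combination fits.
    """
    ranks = sorted(profiles)
    best = None
    for choice in product(*(profiles[r].items() for r in ranks)):
        # Ranks run concurrently, so their power draws add up.
        total_power = sum(watts for _, (_, watts) in choice)
        if total_power > power_cap_watts:
            continue
        # The next MPI call completes only when the slowest rank arrives.
        makespan = max(secs for _, (secs, _) in choice)
        if best is None or makespan < best[0]:
            best = (makespan, {r: cfg for r, (cfg, _) in zip(ranks, choice)})
    return best


if __name__ == "__main__":
    print(best_schedule(PROFILES, 150.0))
```

With these made-up numbers and a 150 W cap, the search gives the heavier rank the fast configuration and slows the lighter one, which illustrates why an even split of power across ranks can fall short of the bound. Exhaustive search grows exponentially with the number of ranks and sections, which is one reason a linear programming formulation is attractive for whole-application schedules.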

Published In

SC '15: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
November 2015
985 pages
ISBN: 9781450337236
DOI: 10.1145/2807591
General Chair: Jackie Kern
Program Chair: Jeffrey S. Vetter
© 2015 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Funding Sources

  • DOE ASCR
  • DOE LLNL

Conference

SC15
Acceptance Rates

SC '15 Paper Acceptance Rate 79 of 358 submissions, 22%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

  • (2024) FCUFS: Core-Level Frequency Tuning for Energy Optimization on Intel Processors. 2024 IEEE International Conference on Cluster Computing (CLUSTER), pages 214-225. DOI: 10.1109/CLUSTER59578.2024.00026. Online publication date: 24-Sep-2024
  • (2023) Mapi-Pro: An Energy Efficient Memory Mapping Technique for Intermittent Computing. ACM Transactions on Architecture and Code Optimization, 20:4, pages 1-25. DOI: 10.1145/3629524. Online publication date: 20-Oct-2023
  • (2022) Penelope: Peer-to-peer Power Management. Proceedings of the 51st International Conference on Parallel Processing, pages 1-11. DOI: 10.1145/3545008.3545047. Online publication date: 29-Aug-2022
  • (2020) Performance and Energy Trade-Offs for Parallel Applications on Heterogeneous Multi-Processing Systems. Energies, 13:9 (2409). DOI: 10.3390/en13092409. Online publication date: 11-May-2020
  • (2020) What does Power Consumption Behavior of HPC Jobs Reveal?: Demystifying, Quantifying, and Predicting Power Consumption Characteristics. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 799-809. DOI: 10.1109/IPDPS47924.2020.00087. Online publication date: May-2020
  • (2019) Statistical and machine learning models for optimizing energy in parallel applications. The International Journal of High Performance Computing Applications (109434201984291). DOI: 10.1177/1094342019842915. Online publication date: 25-Apr-2019
  • (2019) Power efficient job scheduling by predicting the impact of processor manufacturing variability. Proceedings of the ACM International Conference on Supercomputing, pages 296-307. DOI: 10.1145/3330345.3330372. Online publication date: 26-Jun-2019
  • (2019) PoDD. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pages 1-23. DOI: 10.1145/3295500.3356174. Online publication date: 17-Nov-2019
  • (2019) Performance and Energy Efficiency Trade-Offs in Single-ISA Heterogeneous Multi-Processing for Parallel Applications. 2019 IFIP/IEEE 27th International Conference on Very Large Scale Integration (VLSI-SoC), pages 232-233. DOI: 10.1109/VLSI-SoC.2019.8920384. Online publication date: Oct-2019
  • (2019) A Scalable Priority-Aware Approach to Managing Data Center Server Power. 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), pages 701-714. DOI: 10.1109/HPCA.2019.00067. Online publication date: Feb-2019