DOI: 10.1145/3624062.3624274
research-article
Open access

Comparing Power Signatures of HPC Workloads: Machine Learning vs Simulation

Published: 12 November 2023

Abstract

Power is a limiting factor for supercomputers, constraining both their scale and their operation. Characterizing the power signatures of different application types can help data centers operate efficiently, even when power-constrained. This paper investigates the power profiles of diverse scientific applications, spanning both traditional simulations and modern machine learning (ML) workloads, running on the Perlmutter supercomputer at the National Energy Research Scientific Computing Center (NERSC). Our findings indicate that traditional simulations typically consume more power on average than ML workloads. Furthermore, ML applications exhibit periodic power fluctuations attributed to epoch transitions during training. Finally, we discuss the potential implications of these insights for automatic demand response (ADR) and considerations for designing future systems.
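
To illustrate the two signature properties named in the abstract (average power and periodic fluctuation), the following is a minimal Python sketch, not the analysis pipeline used in the paper. It assumes a power trace is a list of evenly spaced wattage samples. The function names `mean_power` and `dominant_period`, the autocorrelation-based period estimate, and the two synthetic traces are all hypothetical and chosen only for illustration; the wattage values are invented and are not measurements from Perlmutter.

```python
def mean_power(trace):
    """Average power over a trace of evenly spaced samples (same unit as input)."""
    return sum(trace) / len(trace)


def dominant_period(trace):
    """Estimate the dominant period of a trace, in samples, via autocorrelation.

    Returns the lag with the highest autocorrelation after the autocorrelation
    first drops below zero, or None when no periodic structure is found
    (flat trace, or autocorrelation that never becomes negative).
    """
    n = len(trace)
    mu = mean_power(trace)
    centered = [x - mu for x in trace]
    variance = sum(c * c for c in centered)
    if variance == 0:
        return None  # perfectly flat trace: nothing to detect

    # acf[k] holds the normalized autocorrelation at lag k + 1.
    acf = [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance
        for lag in range(1, n // 2 + 1)
    ]

    # Skip the initial positive lobe around lag 0, which would otherwise win.
    first_negative = next((k for k, v in enumerate(acf) if v < 0), None)
    if first_negative is None:
        return None

    best = max(range(first_negative, len(acf)), key=lambda k: acf[k])
    return best + 1


# Synthetic, illustrative traces (assumed values, not measured data).
# "ML-like": 45 samples at 400 W, then a 5-sample dip to 150 W, repeated,
# loosely mimicking a drop in power at each epoch transition.
ml_trace = [150 if i % 50 >= 45 else 400 for i in range(600)]
# "Simulation-like": steady draw with no periodic structure.
sim_trace = [450] * 600

print("ML mean power:", mean_power(ml_trace))
print("ML dominant period (samples):", dominant_period(ml_trace))
print("Simulation mean power:", mean_power(sim_trace))
print("Simulation dominant period:", dominant_period(sim_trace))
```

In this sketch the period is reported in samples; converting it to seconds requires the sampling interval of the telemetry source, which is not stated here. Real traces are noisy, so a practical analysis would likely need smoothing or a threshold on the autocorrelation peak before declaring a workload periodic.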

Supplemental Material

MP4 File - Conference presentation recording
Recording of "Comparing Power Signatures of HPC Workloads: Machine Learning vs Simulation" presentation at the Sustainable Supercomputing (SusSup23) Workshop


Published In

SC-W '23: Proceedings of the SC '23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis
November 2023
2180 pages
ISBN:9798400707858
DOI:10.1145/3624062
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. automatic demand response
  2. exascale systems
  3. machine learning workloads
  4. modeling workloads
  5. power analysis
  6. power signature
  7. simulation workloads
  8. super computing
  9. sustainable supercomputing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Department of Energy

Conference

SC-W 2023
