DOI: 10.1145/3624062.3624274
Research Article · Open Access

Comparing Power Signatures of HPC Workloads: Machine Learning vs Simulation

Published: 12 November 2023

ABSTRACT

Power is a limiting factor for supercomputers, constraining both their scale and their operation. Characterizing the power signatures of different application types can enable data centers to operate efficiently even when power constrained. This paper investigates the power profiles of diverse scientific applications, spanning both traditional simulations and modern machine learning (ML) workloads, running on the Perlmutter supercomputer at the National Energy Research Scientific Computing Center (NERSC). Our findings indicate that traditional simulations typically consume more power on average than ML workloads. Furthermore, ML applications exhibit periodic power fluctuations attributable to epoch transitions during training. Finally, we discuss the potential implications of these insights for automatic demand response (ADR) and for the design of future systems.
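
As a concrete illustration of how such signatures could be extracted, the sketch below estimates a workload's mean power draw and its dominant fluctuation period from sampled node power readings. This is a minimal sketch under stated assumptions, not the paper's method: the file node_power.csv, the power_watts column, and the 1 Hz sampling rate are hypothetical placeholders for whatever telemetry format a given data center collects.

import numpy as np
import pandas as pd

def summarize_power(csv_path, sample_hz=1.0):
    """Return (mean power in watts, dominant fluctuation period in seconds)."""
    # Hypothetical telemetry layout: one power_watts sample per row at sample_hz.
    power = pd.read_csv(csv_path)["power_watts"].to_numpy(dtype=float)
    mean_power = power.mean()

    # Subtract the mean so the spectrum reflects fluctuations, not baseline draw.
    spectrum = np.abs(np.fft.rfft(power - mean_power))
    freqs = np.fft.rfftfreq(power.size, d=1.0 / sample_hz)

    # Skip the zero-frequency bin; the strongest remaining peak gives the period
    # of any regular fluctuation, such as epoch transitions during ML training.
    peak = 1 + np.argmax(spectrum[1:])
    return mean_power, 1.0 / freqs[peak]

mean_w, period_s = summarize_power("node_power.csv")
print(f"mean power: {mean_w:.1f} W, dominant period: {period_s:.1f} s")

Comparing mean_power across jobs mirrors the simulation-versus-ML average-power comparison, while a pronounced spectral peak in a training job's trace would be consistent with the epoch-boundary fluctuations described above.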


Published in

SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis
November 2023 · 2180 pages
ISBN: 9798400707858
DOI: 10.1145/3624062

Copyright © 2023 ACM. Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery, New York, NY, United States

