ABSTRACT
Power is a limiting factor for supercomputers, constraining both their scale and their operation. Characterizing the power signatures of different application types can enable data centers to operate efficiently, even when power constrained. This paper investigates power profiles of diverse scientific applications, spanning both traditional simulations and modern machine learning (ML) workloads, running on the Perlmutter supercomputer at the National Energy Research Scientific Computing Center (NERSC). Our findings indicate that traditional simulations typically consume more power on average than ML workloads. Furthermore, ML applications exhibit periodic power fluctuations attributed to epoch transitions during training. Finally, we discuss the potential implications of these insights for automatic demand response (ADR) and considerations for designing future systems.
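The epoch-periodic power fluctuations described above can be detected offline from a sampled power trace, for example via autocorrelation. The sketch below is illustrative only, not part of the paper's methodology: `dominant_period` is a hypothetical helper, and the trace is synthetic, assuming node power in watts sampled at a fixed interval.

```python
import numpy as np

def dominant_period(power_w, dt_s):
    """Estimate the dominant fluctuation period (seconds) of a power trace
    (watts) sampled every dt_s seconds, via autocorrelation of the
    mean-detrended signal. Returns None if no periodic peak is found."""
    x = np.asarray(power_w, dtype=float)
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[x.size - 1:]  # lags 0 .. N-1
    # The first local maximum after the zero-lag peak marks the period.
    for lag in range(1, ac.size - 1):
        if ac[lag] >= ac[lag - 1] and ac[lag] >= ac[lag + 1] and ac[lag] > 0:
            return lag * dt_s
    return None

# Synthetic trace: square wave alternating every 30 samples (period 60 s at
# 1 Hz sampling), mimicking epoch-transition dips around a steady baseline.
t = np.arange(600)
trace = 400 + 50 * ((t // 30) % 2)
print(dominant_period(trace, 1.0))  # → 60.0
```

In practice the trace would come from node-level telemetry rather than synthetic data; the autocorrelation approach is robust to the exact waveform shape as long as the fluctuations repeat at a roughly fixed interval.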
Index Terms: Comparing Power Signatures of HPC Workloads: Machine Learning vs Simulation