ABSTRACT
Large-scale distributed computing systems rely on power management to enforce tight power budgets. Existing systems use a central authority that redistributes excess power to power-hungry nodes. This central authority, however, is both a single point of failure and a critical bottleneck, especially at large scale. To address these limitations we propose Penelope, a distributed power management system that shifts power through peer-to-peer transactions, keeping it robust under faults and at large scale. We implement Penelope and compare its performance to SLURM, a centralized power manager, under a variety of power budgets. Under normal conditions, SLURM and Penelope achieve nearly equivalent performance; in faulty environments, however, Penelope achieves 8–15% mean application performance gains over SLURM. At large scale and with increasing message frequency, Penelope maintains its performance, whereas centralized approaches degrade and become unusable.
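To make the peer-to-peer idea concrete, the following is a minimal illustrative sketch, not the paper's actual protocol or implementation: a power-hungry node asks a peer for watts directly, and the peer grants only its slack, so the cluster-wide budget is conserved without any central authority. All names here (`Node`, `slack`, `request_power`) are hypothetical.

```python
class Node:
    """Hypothetical node holding a share of the cluster power budget."""

    def __init__(self, name, cap, demand):
        self.name = name
        self.cap = cap        # watts currently allotted to this node
        self.demand = demand  # watts the node's workload wants

    def slack(self):
        # Watts this node can spare without slowing its own workload.
        return max(0, self.cap - self.demand)

    def request_power(self, peer, amount):
        """Peer-to-peer transaction: the peer grants up to its slack.

        The total allotment (self.cap + peer.cap) is unchanged, so the
        global budget is respected with no coordinator involved.
        """
        grant = min(amount, peer.slack())
        peer.cap -= grant
        self.cap += grant
        return grant


# Example: node `a` is power-hungry, node `b` has slack.
a = Node("a", cap=100, demand=140)
b = Node("b", cap=100, demand=60)
granted = a.request_power(b, a.demand - a.cap)
```

Because each transfer is conserved locally, any subset of concurrent pairwise trades still sums to the original budget, which is what lets such a scheme sidestep the central bottleneck the abstract describes.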
Index Terms
- Penelope: Peer-to-peer Power Management