DOI: 10.1145/3173162.3173190

SmoothOperator: Reducing Power Fragmentation and Improving Power Utilization in Large-scale Datacenters

Published: 19 March 2018

ABSTRACT

With the ever-growing popularity of cloud computing and web services, Internet companies need more computing capacity to serve the demand. However, power has become a major factor limiting growth in the industry: it is often the case that no more servers can be added to a datacenter without exceeding the capacity of the existing power infrastructure. In this work, we first investigate power utilization in Facebook datacenters. We observe that the combination of provisioning for peak power usage, highly fluctuating traffic, and a multi-level power delivery infrastructure leads to a significant power budget fragmentation problem and inefficiently low power utilization. To address this issue, our insight is that the heterogeneity of power consumption patterns among different services provides opportunities to reshape the power profile of each power node by redistributing services. By grouping services with asynchronous peak times under the same power node, we can reduce the peak power of each node and thus create more power headroom, allowing more servers to be hosted and achieving higher throughput. Based on this insight, we develop a workload-aware service placement framework that systematically spreads service instances with synchronous power patterns evenly under the power supply tree, greatly reducing the peak power draw at power nodes. We then leverage dynamic power profile reshaping to make maximal use of the headroom unlocked by our placement framework. Our experiments, based on real production workload and power traces, show that we are able to host up to 13% more machines in production without changing the underlying power infrastructure. Utilizing the unleashed power headroom with dynamic reshaping, we achieve up to an estimated total of 15% and 11% throughput improvement for the latency-critical service and the batch service, respectively, at the same time, with up to 44% energy slack reduction.
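The placement insight in the abstract can be illustrated with a small sketch. This is not the paper's framework: the four traces, their values, and the helper names `peak_of_sum` and `worst_node_peak` are hypothetical, chosen only to show why mixing services with asynchronous peaks under one power node lowers that node's peak draw.

```python
def peak_of_sum(traces):
    """Peak of the aggregate power drawn by services sharing one power node."""
    return max(sum(samples) for samples in zip(*traces))


def worst_node_peak(*nodes):
    """Highest peak across all power nodes for a given placement."""
    return max(peak_of_sum(node) for node in nodes)


# Hypothetical power traces (arbitrary units, six time slots), invented
# for illustration; they are not taken from the paper's production data.
day_a = [2, 2, 8, 8, 2, 2]
day_b = [2, 3, 8, 7, 2, 2]
night_a = [8, 8, 2, 2, 8, 8]
night_b = [7, 8, 2, 2, 8, 7]

# Placement 1: services with synchronous peaks share a power node.
grouped = worst_node_peak([day_a, day_b], [night_a, night_b])

# Placement 2: services with asynchronous peaks share a power node.
mixed = worst_node_peak([day_a, night_a], [day_b, night_b])

print("worst node peak, synchronous services grouped:", grouped)
print("worst node peak, asynchronous services mixed:", mixed)
```

In this toy setting the total energy consumed is the same under both placements; only the per-node peak changes, and the per-node peak is what the power budget must be provisioned for.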


      • Published in

        ACM Conferences
        ASPLOS '18: Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems
        March 2018
        827 pages
        ISBN:9781450349116
        DOI:10.1145/3173162
        • ACM SIGPLAN Notices
          ACM SIGPLAN Notices  Volume 53, Issue 2
          ASPLOS '18
          February 2018
          809 pages
          ISSN:0362-1340
          EISSN:1558-1160
          DOI:10.1145/3296957

        Copyright © 2018 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 19 March 2018


        Qualifiers

        • research-article

        Acceptance Rates

        ASPLOS '18 Paper Acceptance Rate: 56 of 319 submissions, 18%
        Overall Acceptance Rate: 535 of 2,713 submissions, 20%
