ABSTRACT
Idle-state governors partially turn off idle CPUs, placing them in low-power modes known as idle states to save power. Exiting from these idle states, however, delays the execution of tasks and aggravates tail latency. Menu, the default idle-state governor of Linux, predicts periods of idleness from historical data and disk I/O information in order to choose a suitable idle state. Our experiments show that Menu can save power, but at the cost of tail latency, which makes it an inappropriate governor for data centers that host latency-sensitive applications. In this paper, we present the initial design of Yawn, an idle-state governor that aims to mitigate tail latency without sacrificing power. Yawn leverages online machine learning techniques to predict idle periods based on information gathered from all parameters affecting idleness, including network I/O. This results in more accurate predictions, which in turn lead to reduced response times. Preliminary benchmarking results demonstrate that Yawn reduces the 99th-percentile latency of Memcached requests by up to 40%.
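The trade-off described above can be illustrated with a minimal sketch. It is not the algorithm used by Menu or Yawn: the state table, its latency and residency numbers, the `EwmaIdlePredictor` class, and the `select_state` function are all hypothetical, and a simple exponentially weighted moving average stands in for whatever online learner a real governor would use. The sketch only shows the general idea that a governor predicts the next idle period and then picks the deepest state whose wake-up cost is acceptable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdleState:
    name: str
    exit_latency_us: float      # time needed to wake up from this state
    target_residency_us: float  # minimum idle time for the state to pay off


# Hypothetical table, ordered from shallowest to deepest; values are made up.
STATES = [
    IdleState("C1", 2, 2),
    IdleState("C3", 50, 150),
    IdleState("C6", 130, 500),
]


class EwmaIdlePredictor:
    """Stand-in online predictor: a moving average of observed idle periods."""

    def __init__(self, alpha=0.25, initial_us=0.0):
        self.alpha = alpha
        self.estimate_us = initial_us

    def predict(self):
        return self.estimate_us

    def update(self, observed_idle_us):
        # Move the estimate a fraction alpha toward the latest observation.
        self.estimate_us += self.alpha * (observed_idle_us - self.estimate_us)


def select_state(predicted_idle_us, latency_limit_us, states=STATES):
    """Pick the deepest state that fits the prediction and the latency limit."""
    chosen = states[0]  # fall back to the shallowest state
    for state in states:
        if (state.target_residency_us <= predicted_idle_us
                and state.exit_latency_us <= latency_limit_us):
            chosen = state
    return chosen
```

An overestimated idle period sends the CPU into a deep state whose exit latency is then paid by the next request, which is how mispredictions show up as tail latency. An underestimate keeps the CPU in a shallow state and wastes power.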