DOI: 10.1145/3307650.3322227
research-article

SoftSKU: optimizing server architectures for microservice diversity @scale

Published: 22 June 2019

ABSTRACT

The variety and complexity of microservices in warehouse-scale data centers have grown precipitously over the last few years to support a growing user base and an evolving product portfolio. Despite accelerating microservice diversity, there is a strong requirement to limit diversity in underlying server hardware to maintain hardware resource fungibility, preserve procurement economies of scale, and curb qualification/test overheads. As such, there is an urgent need for strategies that enable a limited set of server CPU architectures (a.k.a. "SKUs") to provide performance and energy efficiency over diverse microservices. To this end, we first undertake a comprehensive characterization of the top seven microservices that run on the compute-optimized data center fleet at Facebook.

Our characterization reveals profound diversity in OS and I/O interaction, cache misses, memory bandwidth utilization, instruction mix, and CPU stall behavior. Whereas customizing a CPU SKU for each microservice might be beneficial, it is prohibitive. Instead, we argue for "soft SKUs", wherein we exploit coarse-grain (e.g., boot time) configuration knobs to tune the platform for a particular microservice. We develop a tool, μSKU, that automates search over a soft-SKU design space using A/B testing in production and demonstrate how it can obtain statistically significant gains (up to 7.2% and 4.5% performance improvement over stock and production servers, respectively) with no additional hardware requirements.
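
The search that the abstract describes (try a coarse-grain knob setting, A/B test it against the current configuration, keep it only when the gain is statistically significant) can be illustrated with a minimal sketch. This is not the μSKU implementation: the knob names, the simulated throughput model, the one-knob-at-a-time sweep, and the normal-approximation significance test are all assumptions chosen to keep the example self-contained.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_z_test(a, b):
    """Return (z, two-sided p) for mean(b) - mean(a), using a normal approximation."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (mean(b) - mean(a)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def greedy_knob_search(knobs, baseline, measure, samples=200, alpha=0.05):
    """Sweep one knob at a time; adopt a setting only on a significant A/B gain."""
    config = dict(baseline)
    for name, values in knobs.items():
        for value in values:
            if value == config[name]:
                continue
            candidate = {**config, name: value}
            a = [measure(config) for _ in range(samples)]     # control group
            b = [measure(candidate) for _ in range(samples)]  # treatment group
            z, p = two_sample_z_test(a, b)
            if z > 0 and p < alpha:
                config = candidate
    return config


# Hypothetical knobs and settings, for illustration only.
KNOBS = {
    "hw_prefetcher": ["on", "off"],
    "transparent_hugepages": ["never", "always"],
}
BASELINE = {"hw_prefetcher": "on", "transparent_hugepages": "never"}


def simulated_throughput(config, rng):
    """Stand-in for a production measurement; the effects here are invented."""
    score = 100.0
    if config["hw_prefetcher"] == "off":
        score += 3.0
    if config["transparent_hugepages"] == "always":
        score -= 2.0
    return score + rng.gauss(0.0, 1.0)


rng = random.Random(0)
best = greedy_knob_search(KNOBS, BASELINE, lambda c: simulated_throughput(c, rng))
print(best)
```

In this toy model the search adopts the prefetcher change, which helps, and rejects the huge-page change, which hurts. A real deployment would replace `simulated_throughput` with measurements from live A/B server groups and would likely need a more careful statistical test.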

    • Published in

      ISCA '19: Proceedings of the 46th International Symposium on Computer Architecture
      June 2019
      849 pages
      ISBN:9781450366694
      DOI:10.1145/3307650

      Copyright © 2019 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      ISCA '19 Paper Acceptance Rate: 62 of 365 submissions, 17%. Overall Acceptance Rate: 543 of 3,203 submissions, 17%.