No Clash on Cache: Observations from a Multi-tenant Ecommerce Platform

ABSTRACT
Caching is a classic technique for improving system performance by reducing client-perceived latency and server load. However, cache management remains challenging, and multi-tenancy compounds the difficulty. To shed light on these problems and discuss possible solutions, we characterize the workload of a multi-tenant cache operated by a large e-commerce platform on which thousands of tenants operate independently. We find that workload patterns vary widely across tenants and that each tenant's characteristics change over time. Based on these findings, we highlight strategies for improving the management of multi-tenant cache systems.
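To illustrate the kind of tenant interference motivating this study, the following is a minimal sketch (not the paper's methodology; the tenant workloads and cache size are invented for illustration). Two hypothetical tenants share one LRU cache: tenant A issues skewed, cache-friendly requests over a small hot set, while tenant B issues scan-like requests over a large key space. B's traffic pollutes the shared cache, yet achieves almost no hits itself:

```python
import random
from collections import OrderedDict


class LRUCache:
    """Minimal shared LRU cache that tracks per-tenant hits and requests."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()   # insertion/recency order: LRU at the front
        self.hits = {}
        self.requests = {}

    def access(self, tenant, key):
        self.requests[tenant] = self.requests.get(tenant, 0) + 1
        full_key = (tenant, key)     # tenants never share keys
        if full_key in self.store:
            self.store.move_to_end(full_key)   # refresh recency on a hit
            self.hits[tenant] = self.hits.get(tenant, 0) + 1
        else:
            self.store[full_key] = True
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict the least recently used

    def hit_ratio(self, tenant):
        return self.hits.get(tenant, 0) / self.requests[tenant]


random.seed(42)
cache = LRUCache(capacity=100)
for _ in range(10_000):
    # Tenant A: skewed accesses over a small hot set (cache-friendly).
    cache.access("A", random.choice(range(20)))
    # Tenant B: scan-like accesses over a large key space (cache-hostile).
    cache.access("B", random.randrange(100_000))

print(f"A hit ratio: {cache.hit_ratio('A'):.2f}")
print(f"B hit ratio: {cache.hit_ratio('B'):.2f}")
```

Under this synthetic mix, A's hit ratio stays high while B's is near zero, even though B consumes roughly half the cache's eviction bandwidth — the kind of asymmetry that per-tenant partitioning or admission policies aim to address.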