ABSTRACT
Recent proposals for the disaggregation of compute, memory, storage, and accelerators in data centers promise substantial operational benefits. Unfortunately, for resources like memory, this comes at the cost of performance overhead due to the potential insertion of network latency into every load and store operation. This effect is particularly felt by data-intensive systems due to the size of their working sets, the frequency at which they need to access memory, and the relatively low computation per access. This performance impairment offsets the elasticity benefit of disaggregated memory. This paper presents TELEPORT, a compute pushdown framework for data-intensive systems that run on disaggregated architectures; compared to prior work on compute pushdown, TELEPORT is unique in its efficiency and flexibility. We have developed optimization principles for several popular systems including a columnar in-memory DBMS, a graph processing system, and a MapReduce system. The evaluation results show that using TELEPORT to push down simple operators improves the performance of these systems on state-of-the-art disaggregated OSes by an order of magnitude, thus fully exploiting the elasticity of disaggregated data centers.
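The trade-off described above can be illustrated with a rough back-of-envelope cost model. This is only a sketch under stated assumptions and is not TELEPORT's API or the paper's evaluation methodology: the names `Workload`, `Fabric`, `pull_time_ns`, and `pushdown_time_ns` are hypothetical, and the latency and bandwidth figures are placeholder values chosen for illustration. The model compares a filter operator that pulls every page across the network to the compute node against the same filter run next to the data, with only the matching rows shipped back.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    rows: int              # rows scanned by the operator
    row_bytes: int         # bytes per row
    selectivity: float     # fraction of rows that pass the filter
    cpu_ns_per_row: float  # compute cost per row (low for data-intensive operators)


@dataclass
class Fabric:
    # All values below are illustrative assumptions, not measurements.
    page_bytes: int = 4096                # granularity of one remote fetch
    fetch_latency_ns: float = 5000.0      # assumed cost of fetching one remote page
    bandwidth_bytes_per_ns: float = 10.0  # assumed link bandwidth (about 10 GB/s)


def pull_time_ns(w: Workload, f: Fabric) -> float:
    """Compute node scans the data, fetching every page over the network."""
    total_bytes = w.rows * w.row_bytes
    pages = (total_bytes + f.page_bytes - 1) // f.page_bytes  # ceiling division
    return pages * f.fetch_latency_ns + w.rows * w.cpu_ns_per_row


def pushdown_time_ns(w: Workload, f: Fabric, round_trip_ns: float = 10000.0) -> float:
    """Memory side runs the filter locally and ships back only matching rows.

    Assumes the memory-side processor has the same per-row cost as the
    compute node, which may not hold on real hardware.
    """
    result_bytes = w.rows * w.selectivity * w.row_bytes
    transfer_ns = result_bytes / f.bandwidth_bytes_per_ns
    return round_trip_ns + w.rows * w.cpu_ns_per_row + transfer_ns


if __name__ == "__main__":
    scan = Workload(rows=1_000_000, row_bytes=64, selectivity=0.01, cpu_ns_per_row=1.0)
    fabric = Fabric()
    print("pull     (ns):", pull_time_ns(scan, fabric))
    print("pushdown (ns):", pushdown_time_ns(scan, fabric))
```

In this toy model the pull-based scan is dominated by per-page network latency because the computation per access is small, which is the situation the abstract identifies as problematic. Raising `cpu_ns_per_row` or `selectivity` narrows the gap, which suggests why pushdown is most attractive for simple, highly selective operators.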
Index Terms
- Optimizing Data-intensive Systems in Disaggregated Data Centers with TELEPORT