research-article

Limousine: Blending Learned and Classical Indexes to Self-Design Larger-than-Memory Cloud Storage Engines

Published: 26 March 2024

Abstract

We present Limousine, a self-designing key-value storage engine that can automatically morph to the near-optimal storage engine architecture given a workload, a cloud budget, and a target performance. At its core, Limousine identifies the fundamental design principles of storage engines as combinations of learned and classical data structures that collaborate through algorithms for data storage and access. By unifying these principles over diverse hardware and three major cloud providers (AWS, GCP, and Azure), Limousine creates a massive design space of one quindecillion (10^48) storage engine designs, the vast majority of which do not exist in literature or industry. Limousine contains a distribution-aware IO model to accurately evaluate any candidate design. Using this model, Limousine searches within the exhaustive design space to construct a navigable continuum of designs connected along a Pareto frontier of cloud cost and performance. For storage engines that contain learned components, Limousine also introduces efficient lazy write algorithms to optimize the holistic read-write performance. Once the near-optimal design is decided for the given context, Limousine automatically materializes the corresponding design in Rust code. Using the YCSB benchmark, we demonstrate that storage engines automatically designed and generated by Limousine scale better by up to 3 orders of magnitude when compared with state-of-the-art industry-leading engines such as RocksDB, WiredTiger, FASTER, and Cosine, over diverse workloads, data sets, and cloud budgets.
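
To make the idea of a Pareto frontier of cloud cost and performance concrete, the following is a minimal Rust sketch, not Limousine's actual search procedure. It assumes each candidate design has already been scored with a monthly cost and a latency, both lower-is-better. The `Candidate` type, its fields, and the functions `pareto_frontier` and `fastest_within_budget` are hypothetical names introduced only for this illustration.

```rust
/// A candidate storage engine design scored on two objectives.
/// Both fields are assumptions for illustration; lower is better for each.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub name: String,
    pub monthly_cost_usd: f64,
    pub latency_ms: f64,
}

/// Returns the candidates that no other candidate dominates,
/// ordered by increasing cost (and therefore decreasing latency).
pub fn pareto_frontier(candidates: &[Candidate]) -> Vec<Candidate> {
    let mut sorted: Vec<Candidate> = candidates.to_vec();
    // Sort by cost ascending; break ties by latency ascending.
    sorted.sort_by(|a, b| {
        a.monthly_cost_usd
            .total_cmp(&b.monthly_cost_usd)
            .then(a.latency_ms.total_cmp(&b.latency_ms))
    });

    let mut frontier = Vec::new();
    let mut best_latency = f64::INFINITY;
    for c in sorted {
        // Keep a design only if it is strictly faster than every
        // design that costs the same or less.
        if c.latency_ms < best_latency {
            best_latency = c.latency_ms;
            frontier.push(c);
        }
    }
    frontier
}

/// Picks the lowest-latency frontier design whose cost fits the budget,
/// or `None` when no design is affordable.
pub fn fastest_within_budget(frontier: &[Candidate], budget_usd: f64) -> Option<&Candidate> {
    frontier
        .iter()
        .filter(|c| c.monthly_cost_usd <= budget_usd)
        .min_by(|a, b| a.latency_ms.total_cmp(&b.latency_ms))
}
```

Sorting by cost first means a single pass is enough to discard dominated designs, and the resulting list can be walked from cheapest to fastest, which is one simple way to picture a navigable continuum of designs.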

    • Published in

      Proceedings of the ACM on Management of Data, Volume 2, Issue 1
      SIGMOD
      February 2024
      1874 pages
      EISSN: 2836-6573
      DOI: 10.1145/3654807

      Copyright © 2024 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States
