Abstract
Presto is an open-source distributed SQL query engine that supports analytics workloads involving multiple exabyte-scale data sources. At Meta, Presto is used for low-latency interactive use cases as well as long-running ETL jobs. It was originally launched at Meta in 2013 and donated to the Linux Foundation in 2019. Over the last ten years, maintaining query latency and scalability amid the hyper-growth of data volume at Meta, together with new SQL analytics requirements, has raised significant challenges for Presto. A top priority has been ensuring that query reliability does not regress with the shift towards smaller, more elastic container allocations, under which queries must run with substantially less memory headroom and can be preempted at any time. Additionally, new demands from machine learning, privacy, and graph analytics have driven Presto maintainers to think beyond traditional data analytics. In this paper, we discuss several successful evolutions in recent years that have improved Presto's latency and scalability by several orders of magnitude in production at Meta. Notable among them are hierarchical caching, native vectorized execution engines, materialized views, and Presto on Spark. With these new capabilities, we have deprecated or are in the process of deprecating various legacy query engines so that Presto becomes the single engine serving interactive, ad-hoc, ETL, and graph processing workloads for the entire data warehouse.
Index Terms
- Presto: A Decade of SQL Analytics at Meta