SQL2FPGA: Automated Acceleration of SQL Query Processing on Modern CPU-FPGA Platforms

Published: 30 September 2024

Abstract

Today’s big data query engines are under constant pressure to keep up with the rapidly increasing demand for faster processing of more complex workloads. In the past few years, FPGA-based database acceleration efforts have demonstrated promising performance improvements with good energy efficiency. However, few studies address the programming and design automation support needed to exploit FPGA accelerators in query processing. Most rely on the SQL query plan generated by a CPU query engine and map that plan onto the FPGA accelerators by hand, which is tedious and error-prone. Moreover, such CPU-oriented query plans do not account for the FPGA accelerators and can therefore miss optimization opportunities.
In this article, we present SQL2FPGA, an FPGA accelerator-aware compiler that automatically maps SQL queries onto heterogeneous CPU-FPGA platforms. The SQL2FPGA front-end takes an optimized logical plan of an SQL query from a database query engine and transforms it into a unified operator-level intermediate representation. To generate an optimized FPGA-aware physical plan, SQL2FPGA implements a set of compiler optimization passes that (1) improve operator acceleration coverage by the FPGA, (2) eliminate redundant computation during physical execution, and (3) minimize data transfer overhead between operators on the CPU and FPGA. It also uses machine learning techniques to predict the optimal platform, either CPU or FPGA, for the physical execution of individual query operations. Finally, SQL2FPGA generates the associated query acceleration code for deployment on the heterogeneous CPU-FPGA system. Compared to the widely used Apache Spark SQL framework running on the CPU, SQL2FPGA, using two AMD/Xilinx HBM-based Alveo U280 FPGA boards and Ver. 2020 AMD/Xilinx FPGA overlay designs, achieves average speedups of 10.1x and 13.9x across all 22 TPC-H benchmark queries at scale factors of 1 GB (SF1) and 30 GB (SF30), respectively. When evaluated on AMD/Xilinx Alveo U50 FPGA boards with Ver. 2022 AMD/Xilinx FPGA overlay designs, SQL2FPGA also achieves an average speedup of 9.6x at SF1.
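To make the idea of an operator-level plan with per-operator device placement more concrete, the following is a minimal sketch and not SQL2FPGA's actual intermediate representation. The `Operator` class, the `count_transfers` function, the `build_plan` helper, and the operator names are all hypothetical and chosen only for illustration. The sketch counts the plan edges that cross the CPU-FPGA boundary, a simple stand-in for the data transfer overhead that optimization goal (3) aims to reduce.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    """One node of a hypothetical operator-level plan (illustrative only)."""
    name: str
    device: str  # "CPU" or "FPGA"
    children: List["Operator"] = field(default_factory=list)


def count_transfers(op: Operator) -> int:
    """Count parent/child edges whose two endpoints run on different devices."""
    total = 0
    for child in op.children:
        if child.device != op.device:
            total += 1  # data must move between CPU and FPGA on this edge
        total += count_transfers(child)
    return total


def build_plan(project_device: str) -> Operator:
    """Build a linear plan: scan -> filter -> project -> join -> aggregate.

    Only the placement of the 'project' operator varies; every other
    placement is fixed and is an assumption made for this example.
    """
    scan = Operator("scan", "CPU")
    filt = Operator("filter", "FPGA", [scan])
    proj = Operator("project", project_device, [filt])
    join = Operator("join", "FPGA", [proj])
    return Operator("aggregate", "FPGA", [join])


# Leaving 'project' on the CPU splits the FPGA pipeline in two.
transfers_cpu_project = count_transfers(build_plan("CPU"))
# Placing 'project' on the FPGA keeps the pipeline contiguous.
transfers_fpga_project = count_transfers(build_plan("FPGA"))
print(transfers_cpu_project, transfers_fpga_project)
```

In this toy plan, moving the single CPU-resident `project` operator onto the FPGA removes two of the three boundary crossings, which illustrates why higher acceleration coverage and lower transfer overhead tend to go together. A real cost model would also weigh data volume and per-operator execution time, which this sketch ignores.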

Published In

ACM Transactions on Reconfigurable Technology and Systems  Volume 17, Issue 3
September 2024
434 pages
EISSN:1936-7414
DOI:10.1145/3613592
  • Editor: Deming Chen

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 September 2024
Online AM: 02 July 2024
Accepted: 17 June 2024
Revised: 06 May 2024
Received: 11 January 2024
Published in TRETS Volume 17, Issue 3

Author Tags

  1. Big data analytics
  2. analytical query processing
  3. HBM-based FPGA
  4. high-level synthesis
  5. compilation framework

Qualifiers

  • Research-article

Funding Sources

  • NSERC Discovery
  • Alliance
  • CFI John R. Evans Leaders
  • Huawei Canada and AMD-Xilinx
