DOI: 10.1145/3542929.3563503

GHive: Accelerating Analytical Query Processing in Apache Hive via CPU-GPU Heterogeneous Computing

Published: 07 November 2022

Abstract

As a popular distributed data warehouse system, Apache Hive has been widely used for big data analytics in many organizations. Meanwhile, exploiting the massive parallelism of GPUs to accelerate online analytical processing (OLAP) has been extensively explored in the database community. In this paper, we present GHive, which enhances CPU-based Hive via CPU-GPU heterogeneous computing. GHive is designed for business intelligence applications and provides the same API as Hive for compatibility. To run SQL queries jointly on both CPU and GPU, GHive comes with three key techniques: (i) a novel data model, gTable, which is column-based and enables efficient data movement between CPU memory and GPU memory; (ii) a GPU-based operator library, Panda, which provides a complete set of SQL operators with extensively optimized GPU implementations; (iii) a hardware-aware MapReduce job placement scheme, which judiciously places jobs on either the GPU or the CPU via a cost-based approach. In our experiments, GHive outperforms Hive in both query processing speed and operating expense on the Star Schema Benchmark (SSB).
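To make the third technique more concrete, below is a minimal, hypothetical Java sketch (not code from the GHive repository) of what a cost-based CPU/GPU job placement decision can look like: it estimates a job's completion time on the CPU and on the GPU, charges the GPU path for moving the input over PCIe, and picks the cheaper device. All names (JobStats, cpuRate, gpuRate, pcieBandwidth) and the numbers in main are illustrative assumptions, not values from the paper.

// Hypothetical sketch of a cost-based CPU/GPU placement decision.
// Assumed, illustrative cost model: time = bytes / throughput, plus a
// PCIe transfer charge on the GPU path. Not GHive's actual cost model.
public final class JobPlacementSketch {

    /** Simplified inputs for a single MapReduce/Tez job (assumed known). */
    static final class JobStats {
        long inputBytes;       // bytes the job must read
        double cpuRate;        // CPU processing throughput, bytes/s
        double gpuRate;        // GPU processing throughput, bytes/s
        double pcieBandwidth;  // host-to-device bandwidth, bytes/s
    }

    enum Device { CPU, GPU }

    /** Place the job on the device with the lower estimated completion time. */
    static Device place(JobStats s, boolean gpuAvailable) {
        double cpuCost = s.inputBytes / s.cpuRate;
        // The GPU path must also pay for moving the input over PCIe.
        double gpuCost = s.inputBytes / s.pcieBandwidth + s.inputBytes / s.gpuRate;
        if (!gpuAvailable) {
            return Device.CPU;
        }
        return gpuCost < cpuCost ? Device.GPU : Device.CPU;
    }

    public static void main(String[] args) {
        JobStats s = new JobStats();
        s.inputBytes = 2L * 1024 * 1024 * 1024;   // 2 GiB scan (illustrative)
        s.cpuRate = 1.5e9;                        // ~1.5 GB/s on CPU (assumed)
        s.gpuRate = 20e9;                         // ~20 GB/s on GPU (assumed)
        s.pcieBandwidth = 12e9;                   // ~12 GB/s effective PCIe (assumed)
        System.out.println("Place job on: " + place(s, true));
    }
}

A real placement scheme would also have to account for current device load and per-operator speedups; GHive's actual hardware-aware cost model is described in the paper and is not reproduced here.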




Information & Contributors

Information

Published In

SoCC '22: Proceedings of the 13th Symposium on Cloud Computing
November 2022
574 pages
ISBN: 9781450394147
DOI: 10.1145/3542929
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 November 2022


Qualifiers

  • Research-article


Conference

SoCC '22
Sponsor:
SoCC '22: ACM Symposium on Cloud Computing
November 7 - 11, 2022
San Francisco, California

Acceptance Rates

Overall Acceptance Rate: 169 of 722 submissions, 23%

Contributors


Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 120
  • Downloads (Last 6 weeks): 17
Reflects downloads up to 02 Mar 2025


Cited By

  • (2024) CGgraph: An Ultra-Fast Graph Processing System on Modern Commodity CPU-GPU Co-processor. Proceedings of the VLDB Endowment 17(6), 1405-1417. https://doi.org/10.14778/3648160.3648179. Online publication date: 3-May-2024.
  • (2024) ML-Based Dynamic Operator-Level Query Mapping for Stream Processing Systems in Heterogeneous Computing Environments. 2024 IEEE International Conference on Cluster Computing (CLUSTER), 226-237. https://doi.org/10.1109/CLUSTER59578.2024.00027. Online publication date: 24-Sep-2024.
  • (2023) Towards Building The Next Generation Computation Engine. Proceedings of the ACM Turing Award Celebration Conference - China 2023, 129-130. https://doi.org/10.1145/3603165.3607435. Online publication date: 25-Sep-2023.
  • (2023) QEVIS: Multi-Grained Visualization of Distributed Query Execution. IEEE Transactions on Visualization and Computer Graphics 30(1), 153-163. https://doi.org/10.1109/TVCG.2023.3326930. Online publication date: 26-Oct-2023.
  • (2023) A Comparative Study of the Performance of Real time databases and Big data Analytics Frameworks. 2023 7th International Multi-Topic ICT Conference (IMTIC), 1-7. https://doi.org/10.1109/IMTIC58887.2023.10178651. Online publication date: 10-May-2023.

