DOI: 10.1145/3514221.3520166
short-paper

GHive: A Demonstration of GPU-Accelerated Query Processing in Apache Hive

Published: 11 June 2022

Abstract

As a distributed, fault-tolerant data warehouse system for large-scale data analytics, Apache Hive has been used for various applications in many organizations (e.g., Facebook, Amazon, and Huawei). Meanwhile, it is common practice to exploit the massive parallelism of GPUs to improve the performance of online analytical processing (OLAP) in database systems. This demo presents GHive, which enables Apache Hive to accelerate OLAP queries by jointly utilizing the CPU and GPU in intelligent and efficient ways. The takeaways for SIGMOD attendees include: (1) the superior performance of GHive compared with vanilla Hive, which uses only the CPU; (2) intuitive visualizations of execution statistics for Hive and GHive that show where GHive's acceleration comes from; (3) detailed profiling of the time taken by each operator on the CPU and GPU to show the advantages of GPU execution.

Cited By

  • (2024) A Genetic Algorithm Model for Join Order Query Optimization in Hadoop-Hive Framework. Emerging Trends in Expert Applications and Security, 289-300. DOI: 10.1007/978-981-97-3745-1_25. Online publication date: 28-Sep-2024
  • (2023) Towards Building The Next Generation Computation Engine. Proceedings of the ACM Turing Award Celebration Conference - China 2023, 129-130. DOI: 10.1145/3603165.3607435. Online publication date: 25-Sep-2023
  • (2023) QEVIS: Multi-Grained Visualization of Distributed Query Execution. IEEE Transactions on Visualization and Computer Graphics 30:1, 153-163. DOI: 10.1109/TVCG.2023.3326930. Online publication date: 26-Oct-2023

    Published In

    SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data
    June 2022
    2597 pages
    ISBN:9781450392495
    DOI:10.1145/3514221
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Apache Hive
    2. CPU-GPU co-processing
    3. OLAP

    Qualifiers

    • Short-paper

    Conference

    SIGMOD/PODS '22
    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%
