DOI: 10.1145/3132847.3132916
research-article

A Study of Main-Memory Hash Joins on Many-core Processor: A Case with Intel Knights Landing Architecture

Published: 06 November 2017

Abstract

Advances in processor architecture have recently driven new designs, implementations, and optimizations of main-memory hash join algorithms. The newly released Intel Xeon Phi many-core processor of the Knights Landing architecture (KNL) offers notable hardware features, such as many low-frequency out-of-order cores connected by a 2D mesh and high-bandwidth multi-channel memory (MCDRAM). In this paper, we experimentally revisit state-of-the-art main-memory hash join algorithms to study how these new hardware features affect algorithmic design and tuning, and to identify opportunities for further performance improvement on KNL. Our experiments show that, although many existing optimizations remain valid on KNL with proper tuning, even the state-of-the-art algorithms severely underutilize the memory bandwidth and other hardware resources.
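
For readers unfamiliar with the operator under study, the following is a minimal single-threaded sketch of the build and probe phases that main-memory hash joins share. It is illustrative only: the function name `hash_join`, the `(key, payload)` tuple layout, and the use of a Python dictionary as the hash table are assumptions made for this sketch. It is not the code evaluated in the paper, whose algorithms are parallel and tuned to the hardware.

```python
from collections import defaultdict


def hash_join(build_rel, probe_rel):
    """Equi-join two relations of (key, payload) tuples on key.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Build phase: hash the (ideally smaller) relation on its join key.
    table = defaultdict(list)
    for key, payload in build_rel:
        table[key].append(payload)

    # Probe phase: look up each probe tuple and emit every matching pair.
    out = []
    for key, payload in probe_rel:
        for build_payload in table.get(key, ()):
            out.append((key, build_payload, payload))
    return out


# Small made-up input relations.
r = [(1, "a"), (2, "b"), (2, "c")]
s = [(2, "x"), (3, "y"), (1, "z")]
result = hash_join(r, s)
```

Both phases are dominated by random accesses into the hash table, which is why memory latency and bandwidth matter so much for this operator.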

Published In

CIKM '17: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
November 2017
2604 pages
ISBN: 9781450349185
DOI: 10.1145/3132847

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. database operators
2. hash join algorithms
3. many-core processor

Acceptance Rates

CIKM '17 Paper Acceptance Rate: 171 of 855 submissions, 20%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%

Cited By

• (2022) Triton Join: Efficiently Scaling to a Large Join State on GPUs with Fast Interconnects. Proceedings of the 2022 International Conference on Management of Data, 1017-1032. DOI: 10.1145/3514221.3517911. Online publication date: 10-Jun-2022
• (2021) Efficient local locking for massively multithreaded in-memory hash-based operators. The VLDB Journal. DOI: 10.1007/s00778-020-00642-5. Online publication date: 11-Feb-2021
• (2020) One size does not fit all: accelerating OLAP workloads with GPUs. Distributed and Parallel Databases. DOI: 10.1007/s10619-020-07304-z. Online publication date: 31-Jul-2020
• (2020) VIP: A SIMD vectorized analytical query engine. The VLDB Journal. DOI: 10.1007/s00778-020-00621-w. Online publication date: 13-Jul-2020
• (2020) Robust and efficient memory management in Apache AsterixDB. Software: Practice and Experience 50:7, 1114-1151. DOI: 10.1002/spe.2799. Online publication date: 17-Feb-2020
• (2019) Interleaved multi-vectorizing. Proceedings of the VLDB Endowment 13:3, 226-238. DOI: 10.14778/3368289.3368290. Online publication date: 1-Nov-2019
• (2019) Implementing efficient data compression and encryption in a persistent key-value store for HPC. The International Journal of High Performance Computing Applications (109434201984726). DOI: 10.1177/1094342019847264. Online publication date: 23-May-2019
• (2019) Deploying Hash Tables on Die-Stacked High Bandwidth Memory. Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 239-248. DOI: 10.1145/3357384.3358015. Online publication date: 3-Nov-2019
• (2019) Towards Practical Vectorized Analytical Query Engines. Proceedings of the 15th International Workshop on Data Management on New Hardware, 1-7. DOI: 10.1145/3329785.3329928. Online publication date: 1-Jul-2019
• (2019) BriskStream. Proceedings of the 2019 International Conference on Management of Data, 705-722. DOI: 10.1145/3299869.3300067. Online publication date: 25-Jun-2019
