DOI: 10.1145/3514221.3526042
research-article

Scaling Equi-Joins

Published: 11 June 2022

ABSTRACT

This paper proposes Adaptive-Multistage-Join (AM-Join) for scalable and fast equi-joins in distributed shared-nothing architectures. AM-Join utilizes (a) Tree-Join, a novel algorithm that scales well when the joined tables share hot keys, and (b) Broadcast-Join, the fastest known algorithm for joining on keys that are hot in only one table.
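The split described above, between keys that are hot in both tables and keys that are hot in only one, can be illustrated with a small single-process sketch. This is not the paper's implementation: the function name `classify_keys`, the `hot_threshold` cutoff, and the `"default-join"` label for keys that are hot in neither table are assumptions made for illustration only, since the abstract does not say how hot keys are detected or how the remaining keys are joined.

```python
from collections import Counter


def classify_keys(left_keys, right_keys, hot_threshold):
    """Label each shared join key by the side(s) on which it is hot.

    hot_threshold is a hypothetical cutoff on per-key record counts;
    the abstract does not specify how hot keys are identified.
    """
    left_counts = Counter(left_keys)
    right_counts = Counter(right_keys)
    plan = {}
    # Only keys present in both tables produce inner equi-join results.
    for key in left_counts.keys() & right_counts.keys():
        hot_left = left_counts[key] >= hot_threshold
        hot_right = right_counts[key] >= hot_threshold
        if hot_left and hot_right:
            plan[key] = "tree-join"        # hot in both tables
        elif hot_left or hot_right:
            plan[key] = "broadcast-join"   # hot in exactly one table
        else:
            # Assumption: any conventional partitioned join handles these.
            plan[key] = "default-join"
    return plan


left = ["a"] * 5 + ["b"] * 5 + ["c"]
right = ["a"] * 4 + ["b"] + ["c"] + ["d"]
plan = classify_keys(left, right, hot_threshold=3)
```

Here key `a` is frequent on both sides, `b` is frequent only on the left, `c` is rare on both, and `d` has no partner in the left table, so it does not appear in the plan at all.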

Unlike the state-of-the-art algorithms, AM-Join (a) holistically solves the join-key skew problem by achieving load balancing throughout the join execution, and (b) supports all outer-join variants without record deduplication or custom table partitioning. For the best AM-Join outer-join performance, we propose Index-Broadcast-Join (IB-Join) for Small-Large outer-joins, where one table fits in memory and the other is orders of magnitude larger. IB-Join improves on the state-of-the-art outer-join algorithms.
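To make the Small-Large setting concrete, the following is a minimal sketch of a generic broadcast hash full outer join, where the small table is turned into an in-memory index and probed by every partition of the large table. It is not IB-Join as defined in the paper, whose details the abstract does not give. The function names, the `(key, value)` row layout, and the use of plain lists to stand in for distributed partitions are all assumptions.

```python
from collections import defaultdict


def build_index(small_rows):
    """Index the small table by join key (assumed to fit in memory)."""
    index = defaultdict(list)
    for key, value in small_rows:
        index[key].append(value)
    return index


def probe_partition(partition, index):
    """Join one partition of the large table against the broadcast index."""
    joined, matched = [], set()
    for key, value in partition:
        if key in index:
            matched.add(key)
            joined.extend((key, value, small) for small in index[key])
        else:
            # Large-side row without a partner in the small table.
            joined.append((key, value, None))
    return joined, matched


def small_large_full_outer_join(large_partitions, small_rows):
    index = build_index(small_rows)
    results, matched = [], set()
    for partition in large_partitions:  # stands in for parallel executors
        joined, seen = probe_partition(partition, index)
        results.extend(joined)
        matched |= seen
    # Emit unmatched small-side rows once, after all partitions reported.
    for key, values in index.items():
        if key not in matched:
            results.extend((key, None, small) for small in values)
    return results


large_partitions = [[(1, "L1"), (2, "L2")], [(1, "L3"), (4, "L4")]]
small_rows = [(1, "S1"), (3, "S3")]
rows = small_large_full_outer_join(large_partitions, small_rows)
```

In this sketch each partition reports which small-table keys it matched, and the unmatched small-side rows are emitted once after those reports are merged. That merge step is one simple way to avoid duplicated unmatched rows; the paper's own mechanism may differ.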

The proposed algorithms can be adopted in any shared-nothing architecture. We implemented a MapReduce version using Spark. Our evaluation shows the proposed algorithms execute significantly faster and scale to more skewed and orders-of-magnitude bigger tables when compared to the state-of-the-art algorithms.



Published in

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data
June 2022
2597 pages
ISBN: 9781450392495
DOI: 10.1145/3514221

      Copyright © 2022 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 785 of 4,003 submissions, 20%
