DOI: 10.1145/1565694.1565701
research-article

Spinning relations: high-speed networks for distributed join processing

Published: 28 June 2009

Abstract

By leveraging modern networking hardware (RDMA-enabled network cards), we can shift priorities in distributed database processing significantly. Complex and sophisticated mechanisms to avoid network traffic can be replaced by a scheme that takes advantage of the bandwidth and low latency offered by such interconnects.
We illustrate this phenomenon with cyclo-join, an efficient join algorithm based on continuously pumping data through a ring-structured network. Our approach is capable of exploiting the resources of all CPUs and distributed main-memory available in the network for processing queries of arbitrary shape and datasets of arbitrary size.
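
The ring idea in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation: the abstract says data is pumped continuously through a ring, and the sketch assumes that one relation (S) stays partitioned on the nodes while fragments of the other (R) rotate, that each node probes a local hash table, and that the join is an equi-join on the first tuple field. The function name `cyclo_join_sketch` and its parameters are hypothetical, and no RDMA or real networking is involved.

```python
from collections import defaultdict, deque


def cyclo_join_sketch(r_rows, s_rows, num_nodes, key=lambda row: row[0]):
    """Simulate a ring join: S fragments stay put, R fragments rotate.

    Assumption (not from the abstract): equi-join on key(row), with a
    hash table built once per node over its stationary S fragment.
    """
    # Split both inputs round-robin. No key-based partitioning is needed,
    # because every R fragment eventually visits every node.
    r_frags = deque(r_rows[i::num_nodes] for i in range(num_nodes))
    s_frags = [s_rows[i::num_nodes] for i in range(num_nodes)]

    # Each node builds a hash table over its local S fragment.
    tables = []
    for frag in s_frags:
        table = defaultdict(list)
        for s in frag:
            table[key(s)].append(s)
        tables.append(table)

    results = []
    for _ in range(num_nodes):         # one full revolution of the ring
        for node in range(num_nodes):  # conceptually runs in parallel
            for r in r_frags[node]:
                for s in tables[node].get(key(r), []):
                    results.append((r, s))
        r_frags.rotate(1)              # each fragment moves to its successor
    return results


R = [(1, "a"), (2, "b"), (3, "c"), (2, "d")]
S = [(2, "x"), (3, "y"), (4, "z"), (2, "w")]
pairs = cyclo_join_sketch(R, S, num_nodes=3)
```

After one full revolution every R fragment has met every S fragment exactly once, so the output matches that of a centralized join while each node only ever holds a fraction of the data.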



Published In

DaMoN '09: Proceedings of the Fifth International Workshop on Data Management on New Hardware
June 2009
63 pages
ISBN:9781605587011
DOI:10.1145/1565694
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

DaMoN 2009: Data Management on New Hardware
June 28, 2009
Providence, Rhode Island

Acceptance Rates

Overall Acceptance Rate 94 of 127 submissions, 74%

Article Metrics

  • Downloads (last 12 months): 4
  • Downloads (last 6 weeks): 0
Reflects downloads up to 20 Jan 2025.


Cited By

  • (2023) A Step Toward Deep Online Aggregation. Proceedings of the ACM on Management of Data 1:2 (1-28). DOI: 10.1145/3589269. Online publication date: 20-Jun-2023
  • (2023) Prerequisite-driven Fair Clustering on Heterogeneous Information Networks. Proceedings of the ACM on Management of Data 1:2 (1-27). DOI: 10.1145/3589267. Online publication date: 20-Jun-2023
  • (2023) Efficient and Portable Einstein Summation in SQL. Proceedings of the ACM on Management of Data 1:2 (1-19). DOI: 10.1145/3589266. Online publication date: 20-Jun-2023
  • (2023) BALANCE: Bayesian Linear Attribution for Root Cause Localization. Proceedings of the ACM on Management of Data 1:1 (1-26). DOI: 10.1145/3588949. Online publication date: 30-May-2023
  • (2023) Grep: A Graph Learning Based Database Partitioning System. Proceedings of the ACM on Management of Data 1:1 (1-24). DOI: 10.1145/3588948. Online publication date: 30-May-2023
  • (2023) Distributed GPU Joins on Fast RDMA-capable Networks. Proceedings of the ACM on Management of Data 1:1 (1-26). DOI: 10.1145/3588709. Online publication date: 30-May-2023
  • (2023) ClipSim: A GPU-friendly Parallel Framework for Single-Source SimRank with Accuracy Guarantee. Proceedings of the ACM on Management of Data 1:1 (1-26). DOI: 10.1145/3588707. Online publication date: 30-May-2023
  • (2023) Transaction Scheduling: From Conflicts to Runtime Conflicts. Proceedings of the ACM on Management of Data 1:1 (1-26). DOI: 10.1145/3588706. Online publication date: 30-May-2023
  • (2019) Concurrent query processing in a GPU-based database system. PLOS ONE 14:4 (e0214720). DOI: 10.1371/journal.pone.0214720. Online publication date: 16-Apr-2019
  • (2019) Design and Evaluation of an RDMA-aware Data Shuffling Operator for Parallel Database Systems. ACM Transactions on Database Systems 44:4 (1-45). DOI: 10.1145/3360900. Online publication date: 12-Dec-2019
