research-article

Join processing for flash SSDs: remembering past lessons

Authors:
Jaeyoung Do, Univ. of Wisconsin-Madison
Jignesh M. Patel, Univ. of Wisconsin-Madison

DaMoN '09: Proceedings of the Fifth International Workshop on Data Management on New Hardware, June 2009, Pages 1–8, https://doi.org/10.1145/1565694.1565696

Published: 28 June 2009

ABSTRACT

Flash solid state drives (SSDs) provide an attractive alternative to traditional magnetic hard disk drives (HDDs) for DBMS applications. Naturally there is substantial interest in redesigning critical database internals, such as join algorithms, for flash SSDs. However, we must carefully consider the lessons that we have learnt from over three decades of designing and tuning algorithms for magnetic HDD-based systems, so that we continue to reuse techniques that worked for magnetic HDDs and also work with flash SSDs.

The focus of this paper is on recalling some of these lessons in the context of ad hoc join algorithms. Based on an actual implementation of four common ad hoc join algorithms on both a magnetic HDD and a flash SSD, we show that many of the "surprising" results from magnetic HDD-based join methods also hold for flash SSDs. These results include the superiority of block nested loops join over sort-merge join and Grace hash join in many cases, and the benefits of blocked I/Os. In addition, we find that simply looking at the I/O costs when designing new flash SSD join algorithms can be problematic, as the CPU cost is often a bigger component of the total join cost with SSDs. We hope that these results provide insights and better starting points for researchers designing new join algorithms for flash SSDs.
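
For readers unfamiliar with block nested loops join, the following is a minimal in-memory sketch of the general technique, not the implementation evaluated in the paper. The names `blocks`, `block_nested_loops_join`, and `block_size` are hypothetical, Python lists stand in for on-disk relations, and building a hash table on each outer block is one common choice assumed here.

```python
from collections import defaultdict
from itertools import islice


def blocks(rows, block_size):
    """Yield successive lists of at most block_size rows."""
    it = iter(rows)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def block_nested_loops_join(outer, inner, outer_key, inner_key, block_size):
    """Equi-join two relations, scanning `inner` once per block of `outer`.

    `outer_key` and `inner_key` are tuple positions of the join attribute.
    `inner` must be re-iterable (for example a list), since it is scanned
    once for every outer block.
    """
    result = []
    for block in blocks(outer, block_size):
        # Assumption: hash the current outer block so each inner row
        # is matched with a lookup rather than a scan of the block.
        table = defaultdict(list)
        for row in block:
            table[row[outer_key]].append(row)
        # One full pass over the inner relation per outer block.
        for s in inner:
            for r in table.get(s[inner_key], []):
                result.append(r + s)
    return result


# Hypothetical toy relations for illustration only.
orders = [(1, "pen"), (2, "ink"), (3, "pad"), (1, "cap")]
customers = [(1, "Ann"), (3, "Bob"), (4, "Cy")]
joined = block_nested_loops_join(orders, customers, 0, 0, block_size=2)
print(joined)
```

In this sketch the inner relation is scanned once per outer block, so a larger `block_size` (more memory for the outer relation) means fewer passes over the inner relation.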

Published in
DaMoN '09: Proceedings of the Fifth International Workshop on Data Management on New Hardware
June 2009
63 pages
ISBN: 9781605587011
DOI: 10.1145/1565694
Conference Chairs: Peter A. Boncz (CWI), Kenneth A. Ross (Columbia University)
Copyright © 2009 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
Overall Acceptance Rate: 80 of 102 submissions, 78%