DOI: 10.1145/1565694.1565696
research-article

Join processing for flash SSDs: remembering past lessons

Published: 28 June 2009

ABSTRACT

Flash solid state drives (SSDs) provide an attractive alternative to traditional magnetic hard disk drives (HDDs) for DBMS applications. Naturally there is substantial interest in redesigning critical database internals, such as join algorithms, for flash SSDs. However, we must carefully consider the lessons that we have learnt from over three decades of designing and tuning algorithms for magnetic HDD-based systems, so that we continue to reuse techniques that worked for magnetic HDDs and also work with flash SSDs.

The focus of this paper is on recalling some of these lessons in the context of ad hoc join algorithms. Based on an actual implementation of four common ad hoc join algorithms on both a magnetic HDD and a flash SSD, we show that many of the "surprising" results from magnetic HDD-based join methods also hold for flash SSDs. These results include the superiority of block nested loops join over sort-merge join and Grace hash join in many cases, and the benefits of blocked I/Os. In addition, we find that simply looking at the I/O costs when designing new flash SSD join algorithms can be problematic, as the CPU cost is often a bigger component of the total join cost with SSDs. We hope that these results provide insights and better starting points for researchers designing new join algorithms for flash SSDs.
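For readers unfamiliar with block nested loops join, the following is a minimal in-memory sketch of the idea, not the implementation evaluated in the paper. The names `blocks` and `block_nested_loops_join`, the `(key, payload)` row layout, and the use of a per-block dictionary for matching are all assumptions made for illustration. The point it shows is that the inner input is rescanned once per outer block rather than once per outer row, so a larger block means fewer passes over the inner input.

```python
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Tuple[int, str]  # hypothetical row layout: (join key, payload)


def blocks(rows: Iterable[Row], block_size: int) -> Iterator[List[Row]]:
    """Yield consecutive chunks of at most block_size rows."""
    block: List[Row] = []
    for row in rows:
        block.append(row)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        yield block


def block_nested_loops_join(
    outer: Sequence[Row], inner: Sequence[Row], block_size: int
) -> Tuple[List[Tuple[int, str, str]], int]:
    """Equi-join on the key field; returns (joined rows, inner scan count)."""
    result: List[Tuple[int, str, str]] = []
    inner_scans = 0
    for outer_block in blocks(outer, block_size):
        # Index the current outer block in memory (an illustrative choice).
        lookup: Dict[int, List[str]] = {}
        for key, payload in outer_block:
            lookup.setdefault(key, []).append(payload)
        # One full pass over the inner input per outer block.
        inner_scans += 1
        for key, inner_payload in inner:
            for outer_payload in lookup.get(key, []):
                result.append((key, outer_payload, inner_payload))
    return result, inner_scans


outer_rows = [(1, "a"), (2, "b"), (3, "c"), (2, "d")]
inner_rows = [(2, "x"), (3, "y"), (4, "z")]
joined_small, scans_small = block_nested_loops_join(outer_rows, inner_rows, 2)
joined_large, scans_large = block_nested_loops_join(outer_rows, inner_rows, 4)
```

In this toy run, both block sizes produce the same set of joined rows, while the larger block halves the number of passes over the inner input. On a real device the block would be sized in pages of available memory, and CPU time spent matching rows would count toward total cost alongside I/O.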

Published in

DaMoN '09: Proceedings of the Fifth International Workshop on Data Management on New Hardware
June 2009
63 pages
ISBN: 9781605587011
DOI: 10.1145/1565694

Copyright © 2009 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 80 of 102 submissions, 78%
