DOI: 10.1145/1128022.1128055
Article

Database hash-join algorithms on multithreaded computer architectures

Published: 03 May 2006

ABSTRACT

As the performance gap between main memory and modern processors widens, database algorithms must be adapted to be "architecture-aware" for optimal performance. We address this issue using the computation of hash join, one of the most important operations in database query processing, to study the impact of simultaneous multithreading (SMT) and main-memory latency (cache misses) on performance.

Prior work [8] studied cache misses using a simulation based on the Compaq ES40. Our results are obtained by measuring the performance of actual hardware (Intel Pentium and Xeon, and AMD Opteron), first for the single-threaded version of the hash-join algorithm used in the prior work and then for a new version designed for multiple threads.

We found that hardware prefetching of data from main memory into CPU cache, as implemented in the architectures we tested, significantly reduces the real-world benefit of software prefetching (contrary to prior work on simulated systems). We also found that SMT achieved significant speedup for our thread-aware hash-join algorithm when compared with single-threaded execution on the same single processor. Software prefetching also proved beneficial in this environment.
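To make the term "software prefetching" concrete, here is a minimal sketch of a hash-join probe loop that issues a prefetch hint for a bucket it will need a few iterations later. It is an illustration only and not the algorithm evaluated in the paper or in [8]: the `HashTable` layout, the function names, and the `PREFETCH_DISTANCE` value are all hypothetical, and `__builtin_prefetch` is a GCC/Clang builtin that other compilers may not provide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical chained hash table keyed on a 32-bit join attribute. */
typedef struct Node {
    uint32_t key;
    struct Node *next;
} Node;

typedef struct {
    Node **buckets;   /* bucket heads, one pointer per bucket */
    Node *nodes;      /* node storage, one node per build tuple */
    size_t nbuckets;  /* must be a power of two */
} HashTable;

/* Multiplicative hash, masked down to a bucket index. */
static size_t bucket_of(const HashTable *ht, uint32_t key) {
    return (size_t)(key * 2654435761u) & (ht->nbuckets - 1);
}

/* Build phase: insert every build-side key at the head of its chain. */
static int ht_build(HashTable *ht, const uint32_t *keys, size_t n,
                    size_t nbuckets) {
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    ht->nbuckets = nbuckets;
    ht->buckets = calloc(nbuckets, sizeof *ht->buckets);
    ht->nodes = malloc((n ? n : 1) * sizeof *ht->nodes);
    if (!ht->buckets || !ht->nodes) {
        free(ht->buckets);
        free(ht->nodes);
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        size_t b = bucket_of(ht, keys[i]);
        ht->nodes[i].key = keys[i];
        ht->nodes[i].next = ht->buckets[b];
        ht->buckets[b] = &ht->nodes[i];
    }
    return 0;
}

/* Assumed tuning knob: how many probe tuples ahead to prefetch. */
#define PREFETCH_DISTANCE 8

/* Probe phase: count matching (build, probe) pairs. While tuple i is
 * being processed, hint the bucket head that tuple i + PREFETCH_DISTANCE
 * will read, so that the cache miss can overlap with useful work. */
static size_t ht_probe_count(const HashTable *ht, const uint32_t *probe,
                             size_t n) {
    size_t matches = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            size_t ahead = bucket_of(ht, probe[i + PREFETCH_DISTANCE]);
            __builtin_prefetch(&ht->buckets[ahead], 0, 1);
        }
        for (const Node *p = ht->buckets[bucket_of(ht, probe[i])]; p;
             p = p->next) {
            if (p->key == probe[i])
                matches++;
        }
    }
    return matches;
}

/* Release the storage allocated by ht_build. */
static void ht_free(HashTable *ht) {
    free(ht->buckets);
    free(ht->nodes);
}
```

The prefetch is only a hint, so removing it does not change the result, only (possibly) the running time. A hardware prefetcher, as discussed in the abstract, can already hide part of the latency on sequential accesses, which is one reason the measured benefit of such hints depends on the machine.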

References

  1. Intel multi-core processor architecture development backgrounder. Intel White Paper.
  2. Multi-core processors-- the next evolution in computing. AMD White Paper, 2005.
  3. A. Ailamaki, D. J. DeWitt, M. D. Hill, and D. A. Wood. DBMSs on a modern processor: Where does time go? In Proc. of the 25th International Conference on Very Large Data Bases, 1999.
  4. D. Boggs, A. Baktha, J. Hawkins, D. T. Marr, J. A. Miller, P. Roussel, R. Singhal, B. Toll, and K. Venkatraman. The microarchitecture of the Intel Pentium 4 processor on 90nm technology. Intel Technology Journal, (Q1):4--15, 2002.
  5. P. Boncz, S. Manegold, and M. L. Kersten. Database architecture optimized for the new bottleneck: Memory access. In Proc. of the 25th International Conference on Very Large Data Bases, 1999.
  6. D. Burger and J. R. Goodman. Billion-transistor architectures: There and back again. IEEE Computer, 37:22--28, Mar. 2004.
  7. D. Carmean. Data management challenges on new computer architectures. In First Int'l Workshop on Data Management on New Hardware (DaMoN), June 2005. Oral Presentation.
  8. S. Chen, A. Ailamaki, P. B. Gibbons, and T. C. Mowry. Improving hash join performance through prefetching. In IEEE International Conference on Data Engineering, 2004.
  9. S. Chen, P. B. Gibbons, and T. C. Mowry. Improving index performance through prefetching. In ACM SIGMOD International Conference on the Management of Data, May 2001.
  10. S. Chen, P. B. Gibbons, T. C. Mowry, and G. Valentin. Fractal prefetching B+-trees: Optimizing both cache and disk performance. In ACM SIGMOD International Conference on the Management of Data, June 2002.
  11. S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, R. L. Stamm, and D. M. Tullsen. Simultaneous multithreading: A platform for next-generation processors. IEEE Micro, 17(5):12--19, 1997.
  12. P. Garcia and H. F. Korth. Multithreaded architectures and the sort benchmark. In First Int'l Workshop on Data Management on New Hardware (DaMoN), June 2005.
  13. G. Hinton, D. Sager, M. Upton, D. Boggs, D. Carmean, A. Kyker, and P. Roussel. The microarchitecture of the Pentium 4 processor. Intel Technology Journal, (Q1), 2001.
  14. Intel. Intel Pentium 4 Processor Optimization, 2001.
  15. R. Kalla, B. Sinharoy, and J. M. Tendler. IBM Power5 chip: A dual-core multithreaded processor. 2004.
  16. M. Kitsuregawa, H. Tanaka, and T. Moto-Oka. Application of hash to data base machine and its architecture. In New Generation Computing, volume 1, pages 63--74, 1983.
  17. J. J. Lo, L. A. Barroso, S. Eggers, K. Gharachorloo, H. Levy, and S. Parekh. An analysis of database workload performance on simultaneous multithreaded processors. Technical report, Compaq, July 1998.
  18. D. T. Marr, F. Binns, D. L. Hill, G. Hinton, D. A. Koufaty, J. A. Miller, and M. Upton. Hyper-threading technology architecture and microarchitecture. Intel Technology Journal, (Q1):4--15, 2002.
  19. L. K. McDowell, S. J. Eggers, and S. D. Gribble. Improving server software support for simultaneous multithreaded processors. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, June 2003.
  20. Sun Microsystems. Throughput computing: Changing the economics and ecology of the data center with innovative SPARC® technology. White Paper.
  21. V. K. Reddy, A. M. Sule, and A. V. Anantaraman. Hyper-threading on the Pentium 4, December 2002.
  22. S. Rixner, W. J. Dally, U. J. Kapasi, P. R. Mattson, and J. D. Owens. Memory access scheduling. In ACM/IEEE International Symposium on Computer Architecture (ISCA), pages 128--138, 2000.
  23. S. Manegold, P. Boncz, and M. L. Kersten. Generic database cost models for hierarchical memory systems. In Proceedings of the 28th VLDB Conference, 2002.
  24. A. Shatdal, C. Kant, and J. F. Naughton. Cache conscious algorithms for relational query processing. In Proc. of the 20th International Conference on Very Large Data Bases, pages 510--521. Morgan Kaufmann Publishers Inc., 1994.
  25. A. Silberschatz, H. F. Korth, and S. Sudarshan. Database System Concepts, 5th Edition. McGraw Hill, 2006.
  26. D. M. Tullsen, S. Eggers, and H. M. Levy. Simultaneous multithreading: Maximizing on-chip parallelism. In Proc. 22nd Annual International Symposium on Computer Architecture, June 1995.
  27. D. M. Tullsen, S. J. Eggers, J. S. Emer, H. M. Levy, J. L. Lo, and R. L. Stamm. Exploiting choice: Instruction fetch and issue on an implementable simultaneous multithreading processor. In ACM/IEEE International Symposium on Computer Architecture (ISCA), pages 191--202, 1996.
  28. J. Zhou, J. Cieslewicz, K. A. Ross, and M. Shah. Improving database performance on simultaneous multithreading processors. In VLDB '05: Proceedings of the 31st International Conference on Very Large Data Bases, pages 49--60. VLDB Endowment, 2005.
  29. J. Zhou and K. A. Ross. Implementing database operations using SIMD instructions. In Proc. ACM SIGMOD International Conference on the Management of Data, June 2002.

    • Published in

      CF '06: Proceedings of the 3rd conference on Computing frontiers
      May 2006
      430 pages
      ISBN:1595933026
      DOI:10.1145/1128022

      Copyright © 2006 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 240 of 680 submissions, 35%