
On the performance of parallel join processing in shared nothing database systems

PARLE '93 Parallel Architectures and Languages Europe (PARLE 1993)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 694)

Abstract

Parallel database systems aim at providing high throughput for OLTP transactions as well as short response times for complex and data-intensive queries. Shared nothing systems represent the major architecture for parallel database processing. While the performance of such systems has been extensively analyzed in the past, the corresponding studies have made a number of best-case assumptions. In particular, almost all performance studies on parallel query processing assumed single-user mode, i.e., that the entire system is exclusively reserved for processing a single query. We study the performance of parallel join processing under more realistic conditions, in particular for multi-user mode. Experiments conducted with a detailed simulation model of shared nothing systems demonstrate the need for dynamic load balancing strategies for efficient join processing in multi-user mode. We focus on two major issues: (a) determining the number of processors to be allocated for the execution of join queries, and (b) determining which processors are to be chosen for join processing. For these scheduling decisions, we consider the current resource utilization as well as the size of intermediate results. Even simple dynamic scheduling strategies are shown to outperform static schemes by a large margin.
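The two scheduling decisions named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `choose_join_processors`, the `pages_per_processor` threshold, and the representation of utilization as a fraction per node are all assumptions made for illustration. The sketch sizes the degree of join parallelism from the intermediate result size and then picks the currently least-utilized processors.

```python
import math


def choose_join_processors(cpu_utilization, result_pages, pages_per_processor=1000):
    """Pick processors for a join (hypothetical heuristic, not the paper's method).

    cpu_utilization: dict mapping processor name -> current utilization in [0, 1]
    result_pages: estimated size of the join input (intermediate result), in pages
    pages_per_processor: assumed amount of data one join processor should handle
    """
    if not cpu_utilization:
        raise ValueError("need at least one processor")

    # (a) Degree of parallelism: grows with the intermediate result size,
    # but never exceeds the number of processors in the system.
    wanted = max(1, math.ceil(result_pages / pages_per_processor))
    degree = min(wanted, len(cpu_utilization))

    # (b) Processor selection: prefer the currently least-utilized nodes
    # (ties are broken by processor name to keep the result deterministic).
    ranked = sorted(cpu_utilization, key=lambda node: (cpu_utilization[node], node))
    return ranked[:degree]


utilization = {"P1": 0.90, "P2": 0.20, "P3": 0.55, "P4": 0.10}
print(choose_join_processors(utilization, result_pages=2500))
# → ['P4', 'P2', 'P3']
```

A static scheme would, by contrast, fix both the degree and the processor set in advance regardless of the current load, which is the kind of baseline the abstract reports being outperformed.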




Editor information

Arndt Bode, Mike Reeve, Gottfried Wolf


Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Marek, R., Rahm, E. (1993). On the performance of parallel join processing in shared nothing database systems. In: Bode, A., Reeve, M., Wolf, G. (eds) PARLE '93 Parallel Architectures and Languages Europe. PARLE 1993. Lecture Notes in Computer Science, vol 694. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56891-3_50


  • DOI: https://doi.org/10.1007/3-540-56891-3_50


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56891-9

  • Online ISBN: 978-3-540-47779-2

  • eBook Packages: Springer Book Archive
