Skew handling in the DBS3 parallel database system

  • Conference paper
Parallel Computation (ACPC 1996)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1127))

Abstract

The gains of parallel query execution can be limited because of high start-up time, interference between execution entities, and poor load balancing. In this paper, we present a solution which reduces these limitations in DBS3, a shared-memory parallel database system. This solution combines static data partitioning and dynamic processor allocation to adapt to the execution context. It makes DBS3 almost insensitive to data skew and allows decoupling the degree of parallelism from the degree of data partitioning. To address the problem of load balancing in the presence of data skew, we analyze three important factors that influence the behavior of our parallel execution model: skew factor, degree of parallelism and degree of partitioning. We report on experiments varying these three parameters with the DBS3 prototype on a 72-node KSR1 multiprocessor. The results demonstrate high performance gains, even with highly skewed data.
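The abstract does not spell out the scheduling mechanism, so the following is only a minimal sketch of the general idea it names: partitions are fixed statically, while processors pick up work dynamically, which lets the number of partitions differ from the number of workers. Everything here is an assumption made for illustration and not DBS3's actual implementation: the function names, the Zipf-like weighting used to generate skewed partition sizes, and the toy cost model in which processing time is proportional to partition size.

```python
import heapq


def zipf_partition_sizes(total_tuples, num_partitions, skew):
    """Split total_tuples over partitions with weights 1 / rank**skew.

    skew = 0.0 gives a uniform split; larger values give more skew.
    (Hypothetical helper, used only to generate test data.)
    """
    weights = [1.0 / (rank ** skew) for rank in range(1, num_partitions + 1)]
    norm = sum(weights)
    sizes = [int(total_tuples * w / norm) for w in weights]
    # Put the rounding remainder into the first (largest) partition.
    sizes[0] += total_tuples - sum(sizes)
    return sizes


def static_makespan(sizes, num_workers):
    """Bind partition i to worker i % num_workers before execution starts."""
    loads = [0] * num_workers
    for i, size in enumerate(sizes):
        loads[i % num_workers] += size
    return max(loads)


def dynamic_makespan(sizes, num_workers):
    """Let whichever worker becomes idle first take the next partition."""
    loads = [0] * num_workers
    heapq.heapify(loads)
    for size in sizes:
        least_loaded = heapq.heappop(loads)
        heapq.heappush(loads, least_loaded + size)
    return max(loads)


if __name__ == "__main__":
    # 64 partitions processed by 8 workers: the degree of partitioning
    # is chosen independently of the degree of parallelism.
    sizes = zipf_partition_sizes(100_000, 64, skew=1.0)
    print("static :", static_makespan(sizes, 8))
    print("dynamic:", dynamic_makespan(sizes, 8))
```

In this toy model the completion time is the load of the busiest worker. Dynamic assignment is not guaranteed to beat a fixed assignment for every input order, but with many small partitions per worker it tends to even out the load when partition sizes are skewed.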

This work has been partially funded by the CEC under ESPRIT project IDEA.

Editor information

László Böszörményi

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bouganim, L., Florescu, D., Dageville, B. (1996). Skew handling in the DBS3 parallel database system. In: Böszörményi, L. (eds) Parallel Computation. ACPC 1996. Lecture Notes in Computer Science, vol 1127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61695-0_9

  • DOI: https://doi.org/10.1007/3-540-61695-0_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61695-5

  • Online ISBN: 978-3-540-70645-8

  • eBook Packages: Springer Book Archive
