Abstract
The gains of parallel query execution can be limited because of high start-up time, interference between execution entities, and poor load balancing. In this paper, we present a solution which reduces these limitations in DBS3, a shared-memory parallel database system. This solution combines static data partitioning and dynamic processor allocation to adapt to the execution context. It makes DBS3 almost insensitive to data skew and allows decoupling the degree of parallelism from the degree of data partitioning. To address the problem of load balancing in the presence of data skew, we analyze three important factors that influence the behavior of our parallel execution model: skew factor, degree of parallelism and degree of partitioning. We report on experiments varying these three parameters with the DBS3 prototype on a 72-node KSR1 multiprocessor. The results demonstrate high performance gains, even with highly skewed data.
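The effect of decoupling the degree of parallelism from the degree of partitioning can be illustrated with a small toy model. The sketch below is not DBS3's actual scheduler. It assumes that processing cost is proportional to fragment size, that fragment sizes follow a Zipf-like law, and that an idle thread simply takes the next fragment from a shared queue. The function names `zipf_sizes`, `static_makespan` and `dynamic_makespan` are hypothetical and chosen for illustration only.

```python
import heapq


def zipf_sizes(total, fragments, theta):
    """Split `total` tuples over `fragments` buckets with a Zipf-like law.

    theta = 0 gives a uniform split; larger theta gives stronger skew.
    """
    weights = [1.0 / (rank ** theta) for rank in range(1, fragments + 1)]
    scale = total / sum(weights)
    return [w * scale for w in weights]


def static_makespan(sizes, threads):
    """Fragments are bound to threads in advance (round-robin).

    Returns the load of the busiest thread, i.e. the response time
    in this toy model.
    """
    load = [0.0] * threads
    for i, size in enumerate(sizes):
        load[i % threads] += size
    return max(load)


def dynamic_makespan(sizes, threads):
    """Each fragment goes to whichever thread becomes idle first.

    Models threads pulling work from a shared queue; `finish` is a
    min-heap of per-thread finish times.
    """
    finish = [0.0] * threads
    for size in sizes:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + size)
    return max(finish)


if __name__ == "__main__":
    # 200 fragments, 10 threads: partitioning degree > parallelism degree.
    sizes = zipf_sizes(total=1000, fragments=200, theta=1.0)
    print("static :", static_makespan(sizes, threads=10))
    print("dynamic:", dynamic_makespan(sizes, threads=10))
```

In this model, the response time with dynamic allocation is bounded below by the largest fragment, so a degree of partitioning higher than the degree of parallelism gives the scheduler smaller units to rebalance. With uniform data (`theta = 0`) and one fragment per thread, both strategies give the same result.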
This work has been partially funded by the CEC under ESPRIT project IDEA.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bouganim, L., Florescu, D., Dageville, B. (1996). Skew handling in the DBS3 parallel database system. In: Böszörményi, L. (eds) Parallel Computation. ACPC 1996. Lecture Notes in Computer Science, vol 1127. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61695-0_9
DOI: https://doi.org/10.1007/3-540-61695-0_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61695-5
Online ISBN: 978-3-540-70645-8
eBook Packages: Springer Book Archive