Abstract
Most parallel database query processing methods proposed so far adopt the task-oriented approach: decomposing a query into tasks, allocating tasks to processors, and executing the tasks in parallel. However, this strategy may not be effective when some processors are overloaded with time-consuming tasks caused by unpredictable factors such as data skew. In this paper, we propose a dynamic and load-balanced task-oriented database query processing approach that minimizes the completion time of user queries. It consists of three phases: task generation, task acquisition and execution, and task stealing. Using this approach, a database query is decomposed into a set of tasks. At run-time, these tasks are allocated dynamically to available processors. When a processor completes its assigned tasks and no new tasks are available, it steals subtasks from overloaded processors to share their load. A performance study was conducted to demonstrate the feasibility and effectiveness of this approach, using a join query as an example. The techniques that can be used to select task donors from overloaded processors and to determine the amount of work to be transferred are discussed. The factors that may affect the effectiveness, such as the number of tasks into which a query is decomposed, are also investigated.
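The three phases can be illustrated with a small, deterministic simulation. This is only a sketch under stated assumptions, not the method evaluated in the paper: the function name `simulate`, the tick-based cost model, the choice of the most loaded processor as donor, and the transfer of half of the donor's remaining work are all hypothetical choices made for illustration. It also assumes that a task is divisible into unit-sized subtasks and that transferring work is free.

```python
from collections import deque


def simulate(task_costs, n_procs, stealing=True):
    """Return the number of ticks needed to finish all tasks.

    Each processor completes one unit of work per tick.
    All policies here are illustrative assumptions, not taken from the paper.
    """
    # Phase 1 (task generation): the decomposed tasks wait in a shared pool.
    pool = deque(task_costs)
    # Work units left on each processor's current task.
    remaining = [0] * n_procs
    ticks = 0
    while pool or any(remaining):
        for p in range(n_procs):
            if remaining[p] != 0:
                continue  # still busy with its current task
            if pool:
                # Phase 2 (task acquisition): an idle processor takes the next task.
                remaining[p] = pool.popleft()
            elif stealing:
                # Phase 3 (task stealing): assumed policy, pick the most loaded
                # processor as donor and take half of its remaining work.
                donor = max(range(n_procs), key=lambda q: remaining[q])
                if remaining[donor] >= 2:
                    share = remaining[donor] // 2
                    remaining[donor] -= share
                    remaining[p] = share
        # Execution: every busy processor completes one unit of work.
        for p in range(n_procs):
            if remaining[p] > 0:
                remaining[p] -= 1
        ticks += 1
    return ticks


if __name__ == "__main__":
    skewed = [12, 1, 1, 1]  # one task is much larger, as data skew might cause
    print(simulate(skewed, 4, stealing=False))
    print(simulate(skewed, 4, stealing=True))
```

With the skewed input above and four processors, the run without stealing takes 12 ticks because one processor carries the large task alone, while the run with stealing takes 4 ticks, which matches the lower bound of 15 work units spread over 4 processors. A real system would also have to account for the cost of moving data between processors, which this sketch ignores.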
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, H., Tan, K.L. (1992). Dynamic and load-balanced task-oriented database query processing in parallel systems. In: Pirotte, A., Delobel, C., Gottlob, G. (eds) Advances in Database Technology — EDBT '92. EDBT 1992. Lecture Notes in Computer Science, vol 580. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032442
DOI: https://doi.org/10.1007/BFb0032442
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-55270-3
Online ISBN: 978-3-540-47003-8
eBook Packages: Springer Book Archive