Abstract
We study the performance of various run placement policies on disks for the merge phase of concurrent mergesorts using parallel prefetching. The initial sorted runs (input) of a merge and its final sorted run (output) are spread across multiple disks, but each run resides on a single disk. Through detailed simulations, we examine three run placement policies and the impact of buffer thrashing. The results show that, with buffer thrashing avoidance, the best performance is achieved by a run placement policy that dedicates a proper subset of the disks to writing the output runs, while the remaining disks are used for prefetching the input runs in parallel. However, the proper number of write disks is workload dependent and, if not carefully chosen, can adversely affect system performance. In practice, reasonably good performance can be achieved by a run placement policy that does not place the output run of a merge on any of the disks that store its own input runs, but allows the output run to share a disk with some of the input runs of other merges.
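The two placement policies described above can be illustrated with a minimal sketch. It is not the simulator used in the study: the function names, the integer disk numbering, the choice of the highest-numbered disks as write disks, and the random pick among eligible disks are all assumptions made for illustration.

```python
import random


def place_output_dedicated(num_disks, num_write_disks, rng=random):
    """Dedicated-write-disk policy (sketch).

    A proper subset of the disks is reserved for output runs; the rest
    hold input runs. Reserving the highest-numbered disks and choosing
    among them at random are assumptions, not details from the paper.
    """
    if not 0 < num_write_disks < num_disks:
        raise ValueError("write disks must form a non-empty proper subset")
    write_disks = list(range(num_disks - num_write_disks, num_disks))
    return rng.choice(write_disks)


def place_output_avoiding_own_inputs(num_disks, input_disks, rng=random):
    """Avoid-own-inputs policy (sketch).

    The output run of a merge may go to any disk that does not store one
    of that merge's own input runs. Disks holding input runs of other
    merges stay eligible. The random tie-break is an assumption.
    """
    own = set(input_disks)
    candidates = [d for d in range(num_disks) if d not in own]
    if not candidates:
        raise ValueError("every disk holds an input run of this merge")
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    # 8 disks, 2 of them reserved for output runs (disks 6 and 7).
    print(place_output_dedicated(8, 2, rng))
    # A merge whose input runs sit on disks 0, 1 and 2 of a 4-disk system.
    print(place_output_avoiding_own_inputs(4, [0, 1, 2], rng))
```

In the first function the number of write disks is a parameter because, as the abstract notes, the proper value is workload dependent.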
Cite this article
Wu, KL., Yu, P.S. & Teng, J.Z. Run Placement Policies for Concurrent Mergesorts Using Parallel Prefetching. Knowledge and Information Systems 1, 435–457 (1999). https://doi.org/10.1007/BF03325109
DOI: https://doi.org/10.1007/BF03325109