Abstract
Parallel merge sort is useful for sorting a large quantity of data progressively. It must be parallelized carefully, because the conventional algorithm performs poorly: the number of participating processors is halved at each merging stage, down to a single processor in the last stage. The proposed load-balanced merge sort utilizes all processors throughout the computation by distributing data evenly to all processors in each stage, so every processor works in all phases. This yields a significant performance improvement, with a speedup of up to (P−1)/log P, where P is the number of processors. Experimental results demonstrate a speedup of 9.6 (upper bound of 10.7) on a 32-processor Cray T3E when sorting 4M 32-bit integers, and a speedup of 2.3 (upper bound of 2.8) on an 8-node PC cluster.
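The load-balancing idea described in the abstract can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `load_balanced_merge_sort`, the list-of-blocks representation of processors, and the power-of-two processor count are all hypothetical choices made for illustration. Each merging stage joins two neighbouring processor groups and then spreads the merged run evenly over every processor in the combined group, so no processor drops out as the stages proceed.

```python
import heapq
import random


def load_balanced_merge_sort(data, num_procs):
    """Simulate a load-balanced merge sort; returns one sorted block per processor.

    Assumption: num_procs is a power of two. The merge is done centrally here
    for clarity; a real parallel version would have each processor merge only
    its own share of the two runs.
    """
    n = len(data)
    chunk = (n + num_procs - 1) // num_procs  # ceiling division
    # Local sort phase: every (simulated) processor sorts its own block.
    blocks = [sorted(data[i * chunk:(i + 1) * chunk]) for i in range(num_procs)]

    group = 1  # number of processors that currently hold one sorted run
    while group < num_procs:
        size = 2 * group
        for start in range(0, num_procs, size):
            # Blocks inside a group are globally ordered, so concatenating
            # them yields one sorted run per group.
            left = [x for b in blocks[start:start + group] for x in b]
            right = [x for b in blocks[start + group:start + size] for x in b]
            merged = list(heapq.merge(left, right))
            # Redistribute the merged run evenly over all processors of the
            # combined group, so each one keeps a near-equal share.
            base, extra = divmod(len(merged), size)
            pos = 0
            for k in range(size):
                take = base + (1 if k < extra else 0)
                blocks[start + k] = merged[pos:pos + take]
                pos += take
        group = size
    return blocks


random.seed(0)
data = [random.randrange(1000) for _ in range(103)]
blocks = load_balanced_merge_sort(data, 8)
flat = [x for b in blocks for x in b]
```

After the last stage, concatenating the blocks in processor order gives the fully sorted sequence, and block sizes differ by at most one element. A conventional parallel merge sort would instead leave the entire merged run on a single processor at the final stage.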
Cite this article
Jeon, M., Kim, D. Parallel Merge Sort with Load Balancing. International Journal of Parallel Programming 31, 21–33 (2003). https://doi.org/10.1023/A:1021734202931