Abstract.
We have developed a methodology for predicting the performance of parallel algorithms on real parallel machines. The methodology consists of two steps. First, we characterize a machine by enumerating the primitive operations that it is capable of performing along with the cost of each operation. Next, we analyze an algorithm by making a precise count of the number of times the algorithm performs each type of operation. We have used this methodology to evaluate many of the parallel sorting algorithms proposed in the literature. Of these, we selected the three most promising (Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and Valiant's flashsort) and implemented them on the Connection Machine model CM-2. This paper analyzes the three algorithms in detail and discusses the issues that led us to our particular implementations. On the CM-2 the predicted performance of the algorithms closely matches the observed performance, and hence our methodology can be used to tune the algorithms for optimal performance. Although our programs were designed for the CM-2, our conclusions about the merits of the three algorithms apply to other parallel machines as well.
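As an illustration only, the two-step methodology can be sketched as a weighted sum: a predicted running time is the count of each primitive operation multiplied by that operation's cost on the machine. The operation names, unit costs, and counts below are hypothetical placeholders, not figures from the paper or measurements of the CM-2.

```python
from typing import Dict


def predict_time(op_costs: Dict[str, float], op_counts: Dict[str, int]) -> float:
    """Return the sum over operation types of (count * unit cost)."""
    missing = set(op_counts) - set(op_costs)
    if missing:
        raise KeyError(f"no cost given for operations: {sorted(missing)}")
    return sum(op_costs[op] * n for op, n in op_counts.items())


# Step 1: characterize the machine (hypothetical costs in arbitrary time units).
machine = {"arith": 1.0, "send": 40.0, "scan": 15.0}

# Step 2: count the operations each algorithm performs (hypothetical counts).
candidates = {
    "A": {"arith": 500, "send": 10, "scan": 4},
    "B": {"arith": 200, "send": 30},
}

# Compare predicted times and pick the cheapest candidate.
predicted = {name: predict_time(machine, counts) for name, counts in candidates.items()}
best = min(predicted, key=predicted.get)
```

Under these made-up numbers the communication-heavy candidate loses despite doing less arithmetic, which is the kind of trade-off such a cost model is meant to expose before an algorithm is tuned or implemented.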
Additional information
Received April 1996, and in final form June 1996.
About this article
Cite this article
Blelloch, G., Leiserson, C., Maggs, B. et al. An Experimental Analysis of Parallel Sorting Algorithms. Theory Comput. Systems 31, 135–167 (1998). https://doi.org/10.1007/s002240000083
DOI: https://doi.org/10.1007/s002240000083