Abstract
Selection algorithms find the \(k^{\mathrm {th}}\) smallest element in a set of elements. Although optimal parallel selection algorithms exist for theoretical machine models, they are both difficult to implement and inefficient in practice. Consequently, scalable applications can use only a few special cases, such as minimum and maximum, for which efficient implementations exist. To overcome these limitations, we propose a general parallel selection algorithm that scales even on today’s largest supercomputers. Our approach combines an efficient, unbiased median approximation method, recently introduced as median-of-3 reduction, with Hoare’s sequential QuickSelect idea from \(1961\). The resulting algorithm has a time complexity of \(\mathcal {O}(\log ^2 n)\) for \(n\) distributed elements while needing only \(\mathcal {O}(1)\) space. Furthermore, we show it to be a practical solution by explaining implementation details and presenting performance results for up to \(458{,}752\) processor cores.
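The combination of a median-of-3 pivot and QuickSelect can be illustrated with a minimal sequential sketch in Python. It is not the paper's implementation: the distributed elements are simulated by a plain list, the names `median_of_3_reduce` and `parallel_select_sim` are hypothetical, and the grouping into triples is only one assumed reading of median-of-3 reduction. In a distributed setting, the counting steps would presumably be carried out with collective reductions.

```python
def median_of_3_reduce(values):
    """Approximate the median by repeatedly replacing triples with their median.

    Assumption: a ternary reduction tree; incomplete groups forward one element.
    """
    level = list(values)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 3):
            group = level[i:i + 3]
            if len(group) == 3:
                next_level.append(sorted(group)[1])  # median of the triple
            else:
                next_level.append(group[0])  # fewer than three: pass one through
        level = next_level
    return level[0]


def parallel_select_sim(values, k):
    """Return the k-th smallest element (1-based k) using QuickSelect rounds."""
    if not 1 <= k <= len(values):
        raise IndexError("k must be between 1 and len(values)")
    k -= 1  # switch to a 0-based position internally
    active = list(values)
    while True:
        pivot = median_of_3_reduce(active)
        less = [v for v in active if v < pivot]
        equal_count = sum(1 for v in active if v == pivot)
        if k < len(less):
            active = less  # target lies strictly below the pivot
        elif k < len(less) + equal_count:
            return pivot  # target equals the pivot
        else:
            k -= len(less) + equal_count
            active = [v for v in active if v > pivot]  # target lies above
```

Because the pivot is always one of the active elements, every round discards at least that element, so the loop terminates. How quickly the active set shrinks depends on how well the pivot approximates the median.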
Notes
1. In statistics, the \(k^{\mathrm {th}}\) order statistic of a sample is equal to its \(k^{\mathrm {th}}\) smallest value, and the position of this value is called its rank. Unfortunately, rank is also used in MPI to identify a process. To disambiguate, we use the terms position and MPI rank.
2. We use MPI terminology: assuming \(x_i\) is the input at MPI rank \(i\) among \(p\) MPI ranks, Allreduce computes the sum \(\sum _{j=0}^{p-1} x_j\) and Exscan computes the exclusive prefix sum \(\sum _{j=0}^{i-1} x_j\) in parallel.
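The two formulas in this note can be made concrete with a small Python sketch. It only simulates the semantics on a list whose index plays the role of the MPI rank; the function names `allreduce_sum` and `exscan_sum` are hypothetical and no MPI library is involved. The empty sum at rank 0 is taken as 0, following the formula, whereas in MPI itself the Exscan result at rank 0 is undefined.

```python
def allreduce_sum(inputs):
    """Every rank receives the sum of all inputs x_0 .. x_{p-1}."""
    total = sum(inputs)
    return [total] * len(inputs)


def exscan_sum(inputs):
    """Rank i receives the sum of the inputs of ranks 0 .. i-1 (exclusive prefix)."""
    results = []
    running = 0  # assumption: the empty sum at rank 0 is reported as 0
    for x in inputs:
        results.append(running)
        running += x
    return results
```

For the inputs `[3, 1, 4]` on three simulated ranks, `allreduce_sum` yields `[8, 8, 8]` and `exscan_sum` yields `[0, 3, 4]`.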
References
Blum, M., Floyd, R.W., Pratt, V., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7(4), 448–461 (1973)
Fouz, M., Kufleitner, M., Manthey, B., Jahromi, N.Z.: On smoothed analysis of quicksort and Hoare’s find. Comput. Comb. 5609, 158–167 (2009)
Frazer, W.D., McKellar, A.C.: Samplesort: a sampling approach to minimal storage tree sorting. J. ACM 17(3), 496–507 (1970)
Han, Y.: Optimal parallel selection. ACM Trans. Algorithms 3(4) (2007)
Hoare, C.A.R.: Algorithm 63 (Partition) and Algorithm 65 (Find). Commun. ACM 4(7), 321–322 (1961)
Kirschenhofer, P., Prodinger, H., Martínez, C.: Analysis of Hoare’s FIND algorithm with Median-of-Three partition. Random Struct. Alg. 10, 143–156 (1997)
Rabenseifner, R.: Optimization of collective reduction operations. In: Bubak, M., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2004. LNCS, vol. 3036, pp. 1–9. Springer, Heidelberg (2004)
Sack, P., Gropp, W.: A scalable \({\rm MPI}\_{\rm Comm}\_{\rm split}\) algorithm for exascale computing. In: Keller, R., Gabriel, E., Resch, M., Dongarra, J. (eds.) EuroMPI 2010. LNCS, vol. 6305, pp. 1–10. Springer, Heidelberg (2010)
Sanders, P., Träff, J.L.: Parallel Prefix (Scan) algorithms for MPI. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) PVM/MPI 2006. LNCS, vol. 4192, pp. 49–57. Springer, Heidelberg (2006)
Siebert, C., Wolf, F.: Parallel sorting with minimal data. In: Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds.) EuroMPI 2011. LNCS, vol. 6960, pp. 170–177. Springer, Heidelberg (2011)
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Siebert, C. (2014). Scalable and Efficient Parallel Selection. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science(), vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_20
DOI: https://doi.org/10.1007/978-3-642-55224-3_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55223-6
Online ISBN: 978-3-642-55224-3
eBook Packages: Computer Science (R0)