Scalable and Efficient Parallel Selection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8384)

Abstract

Selection algorithms find the \(k^{\mathrm {th}}\) smallest element of a set of elements. Although optimal parallel selection algorithms exist for theoretical machine models, they are not only difficult to implement but also inefficient in practice. Consequently, scalable applications can use only a few special cases, such as minimum and maximum, for which efficient implementations exist. To overcome these limitations, we propose a general parallel selection algorithm that scales even on today’s largest supercomputers. Our approach is based on an efficient, unbiased median approximation method, recently introduced as median-of-3 reduction, and Hoare’s sequential QuickSelect idea from \(1961\). The resulting algorithm has a time complexity of \(\mathcal {O}(\log ^2 n)\) for \(n\) distributed elements while needing only \(\mathcal {O}(1)\) space. Furthermore, we show it to be a practical solution by explaining implementation details and presenting performance results for up to \(458{,}752\) processor cores.
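The two ingredients named in the abstract can be illustrated with a small sequential sketch. It is not the paper's implementation: the function names `median_of_3_reduce` and `select` are hypothetical, the random shuffle before the reduction is an assumption made for this illustration, and the lists used here do not reproduce the \(\mathcal {O}(1)\) space bound. In a distributed setting, the counting step would presumably be a global sum across processes rather than a local loop.

```python
import random


def median_of_3_reduce(values, rng):
    """Approximate the median by replacing groups of three values with
    their median, level by level, until one value remains."""
    level = list(values)
    rng.shuffle(level)  # assumption: randomize placement for this sketch
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 3):
            group = sorted(level[i:i + 3])
            # Median of 3 values; lower middle for a leftover group of 1 or 2.
            next_level.append(group[(len(group) - 1) // 2])
        level = next_level
    return level[0]


def select(values, k, seed=0):
    """Return the k-th smallest element of values (k is 1-based),
    following Hoare's QuickSelect idea with an approximate-median pivot."""
    if not 1 <= k <= len(values):
        raise ValueError("k out of range")
    rng = random.Random(seed)
    active = list(values)
    while True:
        pivot = median_of_3_reduce(active, rng)
        smaller = [v for v in active if v < pivot]
        equal = sum(1 for v in active if v == pivot)
        if k <= len(smaller):
            active = smaller              # answer lies below the pivot
        elif k <= len(smaller) + equal:
            return pivot                  # pivot is the k-th smallest
        else:
            k -= len(smaller) + equal     # skip everything up to the pivot
            active = [v for v in active if v > pivot]
```

Because the pivot is always one of the remaining elements, every round discards at least the pivot itself, so the loop terminates. A good median approximation discards a constant fraction of the elements per round, which is what keeps the number of rounds logarithmic.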

Notes

  1. In statistics, the \(k^{\mathrm {th}}\) order statistic of a sample is equal to its \(k^{\mathrm {th}}\) smallest value, and the position of this value is called its rank. Unfortunately, rank is also used in MPI to identify a process. To disambiguate, we use the terms position and MPI rank.

  2. We use MPI terminology: assuming \(x_i\) is the input at MPI rank \(i\), Allreduce computes the sum \(\sum _{j=0}^{p-1} x_j\) and Exscan computes the prefix sum \(\sum _{j=0}^{i-1} x_j\), both in parallel across the \(p\) MPI ranks.
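The two collectives in Note 2 can be illustrated with a sequential sketch that takes the list of per-rank inputs and returns the list of per-rank results. The names `allreduce_sum` and `exscan_sum` are hypothetical helpers for this illustration, not MPI calls.

```python
def allreduce_sum(inputs):
    """Every rank i receives the same total: sum of x_j for j = 0..p-1."""
    total = sum(inputs)
    return [total] * len(inputs)


def exscan_sum(inputs):
    """Rank i receives the exclusive prefix sum: sum of x_j for j = 0..i-1.

    This sketch returns 0 at rank 0 (the empty sum in the formula);
    MPI itself leaves the MPI_Exscan result at rank 0 undefined.
    """
    results = []
    running = 0
    for x in inputs:
        results.append(running)  # sum of all inputs before this rank
        running += x
    return results


# Four ranks holding the inputs 3, 1, 4, 1.
print(allreduce_sum([3, 1, 4, 1]))  # [9, 9, 9, 9]
print(exscan_sum([3, 1, 4, 1]))     # [0, 3, 4, 8]
```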

References

  1. Blum, M., Floyd, R.W., Pratt, V., Rivest, R.L., Tarjan, R.E.: Time bounds for selection. J. Comput. Syst. Sci. 7(4), 448–461 (1973)

  2. Fouz, M., Kufleitner, M., Manthey, B., Jahromi, N.Z.: On smoothed analysis of quicksort and Hoare’s find. Comput. Comb. 5609, 158–167 (2009)

  3. Frazer, W.D., McKellar, A.C.: Samplesort: a sampling approach to minimal storage tree sorting. J. ACM 17(3), 496–507 (1970)

  4. Han, Y.: Optimal parallel selection. ACM Trans. Algorithms 3(4) (2007)

  5. Hoare, C.A.R.: Algorithm 63 (Partition) and Algorithm 65 (Find). Commun. ACM 4(7), 321–322 (1961)

  6. Kirschenhofer, P., Prodinger, H., Martínez, C.: Analysis of Hoare’s FIND algorithm with Median-of-Three partition. Random Struct. Alg. 10, 143–156 (1997)

  7. Rabenseifner, R.: Optimization of collective reduction operations. In: Bubak, M., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds.) ICCS 2004. LNCS, vol. 3036, pp. 1–9. Springer, Heidelberg (2004)

  8. Sack, P., Gropp, W.: A scalable MPI_Comm_split algorithm for exascale computing. In: Keller, R., Gabriel, E., Resch, M., Dongarra, J. (eds.) EuroMPI 2010. LNCS, vol. 6305, pp. 1–10. Springer, Heidelberg (2010)

  9. Sanders, P., Träff, J.L.: Parallel Prefix (Scan) algorithms for MPI. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) PVM/MPI 2006. LNCS, vol. 4192, pp. 49–57. Springer, Heidelberg (2006)

  10. Siebert, C., Wolf, F.: Parallel sorting with minimal data. In: Cotronis, Y., Danalis, A., Nikolopoulos, D.S., Dongarra, J. (eds.) EuroMPI 2011. LNCS, vol. 6960, pp. 170–177. Springer, Heidelberg (2011)

Author information

Correspondence to Christian Siebert.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Siebert, C. (2014). Scalable and Efficient Parallel Selection. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science, vol 8384. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55224-3_20

  • DOI: https://doi.org/10.1007/978-3-642-55224-3_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55223-6

  • Online ISBN: 978-3-642-55224-3

  • eBook Packages: Computer Science, Computer Science (R0)
