Partitioned Parallel Radix Sort

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1940)

Abstract

Load balanced parallel radix sort solved the load imbalance problem present in parallel radix sort. By redistributing the keys in each round of radix, it gives each processor exactly the same number of keys, thereby reducing the overall sorting time. Load balanced radix sort is currently the fastest known internal sorting method for distributed-memory multiprocessors. However, once the computation time is balanced, the communication time emerges as the bottleneck of the overall sorting performance because of key redistribution. We present in this report a new parallel radix sorter, called partitioned parallel radix sort, that solves the communication problem of balanced radix sort. The new method reduces the communication time by eliminating the redistribution steps. The keys are first sorted in a top-down fashion (left-to-right as opposed to right-to-left) by using some of the most significant bits. Once the keys are localized to each processor, the rest of the sorting is confined within each processor, hence eliminating the need for global redistribution of keys. It enables well-balanced communication and computation across processors. The proposed method has been implemented on three different distributed-memory platforms: IBM SP2, CRAY T3E, and PC Cluster. Experimental results with various key distributions indicate that partitioned parallel radix sort indeed shows significant improvements over balanced radix sort. IBM SP2 shows 13% to 30% improvement in execution time, while Cray/SGI T3E shows 20% to 100%. PC Cluster shows over 2.5-fold improvement in execution time.
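
The two phases described in the abstract (one top-down pass on the most significant bits, then purely local sorting) can be illustrated with a minimal single-process sketch. It is not the authors' implementation: the function names, the 8-bit partitioning width, the 32-bit non-negative keys, and the greedy rule that assigns contiguous bucket ranges to processors are all assumptions made for illustration, and the "processors" are simulated as Python lists rather than message-passing ranks.

```python
from typing import Dict, List


def lsd_radix_sort(keys: List[int], bits: int, radix_bits: int = 8) -> List[int]:
    """Stable right-to-left radix sort on the low `bits` bits of each key."""
    mask = (1 << radix_bits) - 1
    for shift in range(0, bits, radix_bits):
        buckets: List[List[int]] = [[] for _ in range(1 << radix_bits)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)
        keys = [k for bucket in buckets for k in bucket]
    return keys


def partitioned_radix_sort(per_proc_keys: List[List[int]],
                           key_bits: int = 32,
                           msb_bits: int = 8) -> List[List[int]]:
    """Simulate P processors; assumes non-negative keys below 2**key_bits."""
    p = len(per_proc_keys)
    low_bits = key_bits - msb_bits
    n_buckets = 1 << msb_bits

    # Phase 1a: global histogram of the most significant `msb_bits` bits.
    counts = [0] * n_buckets
    for keys in per_proc_keys:
        for k in keys:
            counts[k >> low_bits] += 1
    total = sum(counts)

    # Phase 1b (assumed rule): give each processor a contiguous range of
    # buckets, moving on once the running count reaches its share of the keys.
    owner = [0] * n_buckets
    proc, running = 0, 0
    for b in range(n_buckets):
        owner[b] = proc
        running += counts[b]
        while proc < p - 1 and running * p >= total * (proc + 1):
            proc += 1

    # Phase 1c: the only global exchange; every key moves to its bucket's owner.
    received: List[Dict[int, List[int]]] = [dict() for _ in range(p)]
    for keys in per_proc_keys:
        for k in keys:
            b = k >> low_bits
            received[owner[b]].setdefault(b, []).append(k)

    # Phase 2: each processor sorts its buckets on the remaining low bits,
    # with no further communication.
    result: List[List[int]] = []
    for proc_buckets in received:
        local: List[int] = []
        for b in sorted(proc_buckets):
            local.extend(lsd_radix_sort(proc_buckets[b], low_bits))
        result.append(local)
    return result
```

Concatenating the returned per-processor lists in processor order yields the globally sorted sequence, because bucket ownership is non-decreasing in the bucket index. How evenly the keys are spread depends on the key distribution and on the number of bits used for partitioning; this sketch makes no claim about the balancing scheme used in the paper.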

This work is partially supported by STEPI grant no. 97-NF-03-04-A-01, KRF grant no. 985-0900-003-2, and NSF grant no. INT-9722545.


Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, SJ., Jeon, M., Sohn, A., Kim, D. (2000). Partitioned Parallel Radix Sort. In: Valero, M., Joe, K., Kitsuregawa, M., Tanaka, H. (eds) High Performance Computing. ISHPC 2000. Lecture Notes in Computer Science, vol 1940. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-39999-2_14

  • DOI: https://doi.org/10.1007/3-540-39999-2_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41128-4

  • Online ISBN: 978-3-540-39999-5

  • eBook Packages: Springer Book Archive
