
GPU-Accelerated Large-Scale Distributed Sorting Coping with Device Memory Capacity



Abstract:

Splitter-based parallel sorting algorithms are known to be highly efficient for distributed sorting due to their low communication complexity. Although GPU accelerators can reduce computation cost in general, their effectiveness in distributed sorting algorithms remains unclear. We investigate the applicability of GPU devices to splitter-based algorithms and extend HykSort, an existing splitter-based algorithm, by offloading its costly computation phases to GPUs. To cope with data volumes exceeding the GPU memory capacity, an out-of-core local sort is used, with a small overhead of about 7.5 percent when the data size is tripled. We evaluate the performance of our implementation on the TSUBAME2.5 supercomputer, which comprises over 4,000 NVIDIA K20x GPUs. Weak scaling analysis shows a 389-fold speedup with 0.25 TB/s throughput when sorting 4 TB of 64-bit integer values on 1,024 nodes compared to running on one node; this is 1.40 times faster than the reference CPU implementation. Detailed analysis, however, reveals that performance is mostly bottlenecked by the CPU-GPU host-to-device bandwidth. With order-of-magnitude bandwidth improvements announced for next-generation GPUs, the performance should improve accordingly, in line with other successful GPU accelerations.
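The splitter-based scheme the abstract refers to can be illustrated with a minimal single-process sketch. This is a hypothetical simulation of the general sample-sort/splitter idea, not the paper's HykSort implementation: each simulated rank sorts locally (the phase the paper offloads to GPUs), global splitters are chosen from sampled keys, and a single all-to-all-style exchange routes each key to the rank owning its range, which is what keeps communication complexity low.

```python
import random

def splitter_sort(partitions):
    """Simulate splitter-based distributed sorting across p ranks.

    Hypothetical sketch: each rank sorts locally, contributes samples,
    and p-1 global splitters partition the key space so rank i receives
    the i-th key range in one all-to-all exchange.
    """
    p = len(partitions)
    # Phase 1: local sort on every "rank" (GPU-offloaded in the paper).
    local = [sorted(part) for part in partitions]
    # Phase 2: each rank samples up to p keys; gather all samples and
    # pick p-1 evenly spaced global splitters from the sorted samples.
    samples = sorted(s for part in local
                     for s in random.sample(part, min(p, len(part))))
    splitters = [samples[(i + 1) * len(samples) // p] for i in range(p - 1)]
    # Phase 3: all-to-all exchange -- route each key to the rank whose
    # key range contains it (count of splitters <= key).
    buckets = [[] for _ in range(p)]
    for part in local:
        for key in part:
            dest = sum(key >= s for s in splitters)
            buckets[dest].append(key)
    # Phase 4: final local sort/merge of the received keys.
    return [sorted(b) for b in buckets]

# Four simulated ranks with 100 random keys each.
data = [[random.randrange(1000) for _ in range(100)] for _ in range(4)]
result = splitter_sort(data)
```

Concatenating the per-rank outputs in rank order yields the globally sorted sequence, because the splitters impose a total order on the key ranges.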
Published in: IEEE Transactions on Big Data ( Volume: 2, Issue: 1, 01 March 2016)
Page(s): 57 - 69
Date of Publication: 05 January 2016

