research-article

GPU-Quicksort: A practical Quicksort algorithm for graphics processors

Authors:

Daniel Cederman,

Philippas Tsigas

Journal of Experimental Algorithmics (JEA), Volume 14

Article No.: 4, Pages 1.4 - 1.24

https://doi.org/10.1145/1498698.1564500

Published: 05 January 2010

Abstract

In this article, we describe GPU-Quicksort, an efficient Quicksort algorithm suitable for highly parallel multicore graphics processors. Quicksort has previously been considered an inefficient sorting solution for graphics processors, but we show that in CUDA, NVIDIA's programming platform for general-purpose computations on graphics processors, GPU-Quicksort performs better than the fastest-known sorting implementations for graphics processors, such as radix and bitonic sort. Quicksort can thus be seen as a viable alternative for sorting large quantities of data on graphics processors.

Index Terms

GPU-Quicksort: A practical Quicksort algorithm for graphics processors
1. Mathematics of computing
  1. Mathematical software
2. Theory of computation
  1. Design and analysis of algorithms

Published In

ACM Journal of Experimental Algorithmics Volume 14, Issue

2009

613 pages

ISSN:1084-6654

EISSN:1084-6654

DOI:10.1145/1498698

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 January 2010

Accepted: 01 May 2009

Revised: 01 March 2009

Received: 01 December 2008

Published in JEA Volume 14

Qualifiers

Research-article
Research
Refereed
