Abstract

The paper introduces RADULS, a new parallel sorter based on the radix sort algorithm and intended to sort ultra-large data sets efficiently. For example, 4 G 16-byte records can be sorted with 16 threads in less than 15 s on an Intel Xeon-based workstation. The implementation of RADULS is not only highly optimized to achieve this performance, but also parallelized in a cache-friendly manner to make the most of modern multicore architectures. In addition, our parallel scheduler chooses among a few different procedures at runtime, according to the current parameters of the execution, to manage the workload properly. In all our experiments, RADULS outperforms the competing algorithms.
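To make the underlying idea concrete, the following is a minimal sequential sketch of a byte-wise least-significant-digit (LSD) radix sort over fixed-width records. It is a textbook illustration only, not the RADULS implementation: the abstract does not describe the digit order, buffering, or scheduling that RADULS uses, and the sketch is neither parallel nor cache-tuned. The function name `radix_sort_records` and the parameter `key_bytes` are hypothetical.

```python
from typing import List


def radix_sort_records(records: List[bytes], key_bytes: int) -> List[bytes]:
    """Sort fixed-width records by their first `key_bytes` bytes.

    Assumption: every record is at least `key_bytes` long and keys compare
    as big-endian byte strings. This is a plain LSD radix sort, not RADULS.
    """
    out = list(records)
    # Process key bytes from the least significant (rightmost) to the most
    # significant (leftmost); each pass must be stable.
    for pos in range(key_bytes - 1, -1, -1):
        # Step 1: histogram of the byte values at this position.
        counts = [0] * 256
        for rec in out:
            counts[rec[pos]] += 1
        # Step 2: exclusive prefix sum gives the start offset of each bucket.
        offsets = [0] * 256
        total = 0
        for value in range(256):
            offsets[value] = total
            total += counts[value]
        # Step 3: stable scatter of the records into a temporary buffer.
        tmp = [b""] * len(out)
        for rec in out:
            value = rec[pos]
            tmp[offsets[value]] = rec
            offsets[value] += 1
        out = tmp
    return out
```

Each pass reads the data twice (histogram, then scatter) and writes it once, so the work is linear in the number of records for a fixed key width. That property is what makes radix sorting attractive for inputs of this size.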

Acknowledgments

The work was supported by the Polish National Science Centre under the project DEC-2013/09/B/ST6/03117 (SD, ADG) and by Silesian University of Technology grant no. BKM507/RAU2/2016 (MK).

Author information

Corresponding author

Correspondence to Sebastian Deorowicz.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Kokot, M., Deorowicz, S., Debudaj-Grabysz, A. (2017). Sorting Data on Ultra-Large Scale with RADULS. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation. BDAS 2017. Communications in Computer and Information Science, vol 716. Springer, Cham. https://doi.org/10.1007/978-3-319-58274-0_20

  • DOI: https://doi.org/10.1007/978-3-319-58274-0_20

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-58273-3

  • Online ISBN: 978-3-319-58274-0

  • eBook Packages: Computer Science, Computer Science (R0)
