Concurrent Hash Tables: Fast and General(?)!

Published: 22 February 2019

Abstract

Concurrent hash tables are among the most important concurrent data structures and are used in numerous applications. In some applications, hash table accesses commonly dominate the execution time. To solve these problems efficiently in parallel, we need implementations that achieve speedups in highly concurrent scenarios. Unfortunately, currently available concurrent hashing libraries fall far short of this requirement, particularly when adaptively sized tables are necessary or when contention on some elements occurs.
Our starting point for better-performing data structures is a fast and simple lock-free concurrent hash table based on linear probing. It is, however, limited to word-sized key-value types and does not support dynamic size adaptation. We explain how to lift these limitations in a provably scalable way and demonstrate that dynamic growing has a performance overhead comparable to the same generalization in sequential hash tables.
We perform extensive experiments comparing the performance of our implementations with six of the most widely used concurrent hash tables. Ours are considerably faster than the best algorithms with similar restrictions and an order of magnitude faster than the best more general tables. In some extreme cases, the difference even approaches four orders of magnitude.
All implementations discussed in this publication are available on GitHub [17].
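
To make the starting point concrete, the following is a minimal sketch of a fixed-size, lock-free table with linear probing. It is not the GrowT code or API. The class name `FixedLinearProbingTable`, the hash function, and the choice to pack a 32-bit key and a 32-bit value into one 64-bit cell (so that a single compare-and-swap covers the whole entry) are assumptions made for illustration. Key 0 is reserved as the empty marker, and the sketch supports neither deletion nor growing.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical illustration only; not the GrowT API.
class FixedLinearProbingTable {
public:
    // Capacity must be a power of two and never changes (no growing).
    explicit FixedLinearProbingTable(std::size_t capacity)
        : cells_(capacity), mask_(capacity - 1) {
        assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
        for (auto& cell : cells_) cell.store(kEmpty, std::memory_order_relaxed);
    }

    // Inserts (key, value) if the key is absent. Returns false if the key is
    // already present, the table is full, or key == 0 (reserved as "empty").
    bool insert(std::uint32_t key, std::uint32_t value) {
        if (key == 0) return false;
        const std::uint64_t desired = pack(key, value);
        std::size_t pos = hash(key) & mask_;
        for (std::size_t probes = 0; probes <= mask_; ++probes) {
            std::uint64_t current = cells_[pos].load(std::memory_order_acquire);
            if (current == kEmpty &&
                cells_[pos].compare_exchange_strong(
                    current, desired,
                    std::memory_order_acq_rel, std::memory_order_acquire)) {
                return true;  // this thread claimed the cell
            }
            // Either the cell was already occupied or the CAS lost a race;
            // in both cases `current` now holds the occupant of the cell.
            if (key_of(current) == key) return false;
            pos = (pos + 1) & mask_;  // linear probing with wrap-around
        }
        return false;  // every cell was probed: the table is full
    }

    // Returns the value stored for key, or std::nullopt if it is absent.
    std::optional<std::uint32_t> find(std::uint32_t key) const {
        std::size_t pos = hash(key) & mask_;
        for (std::size_t probes = 0; probes <= mask_; ++probes) {
            const std::uint64_t current =
                cells_[pos].load(std::memory_order_acquire);
            if (current == kEmpty) return std::nullopt;
            if (key_of(current) == key) return value_of(current);
            pos = (pos + 1) & mask_;
        }
        return std::nullopt;
    }

private:
    static constexpr std::uint64_t kEmpty = 0;

    static std::uint64_t pack(std::uint32_t key, std::uint32_t value) {
        return (static_cast<std::uint64_t>(key) << 32) | value;
    }
    static std::uint32_t key_of(std::uint64_t cell) {
        return static_cast<std::uint32_t>(cell >> 32);
    }
    static std::uint32_t value_of(std::uint64_t cell) {
        return static_cast<std::uint32_t>(cell);
    }
    // Simple integer mixer, chosen for the sketch only.
    static std::size_t hash(std::uint32_t x) {
        x ^= x >> 16;
        x *= 0x45d9f3bU;
        x ^= x >> 16;
        return x;
    }

    std::vector<std::atomic<std::uint64_t>> cells_;
    std::size_t mask_;
};
```

Because a cell is only ever changed from empty to occupied, a failed compare-and-swap never requires a retry on the same cell: the thread inspects the key that won and either reports a duplicate or moves to the next position. The fixed capacity and the restriction to entries that fit in one atomic word are the kinds of limitations the abstract refers to.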

References

[1]
Lada A. Adamic and Bernardo A. Huberman. 2002. Zipf’s law and the Internet. Glottometrics 3, 1 (2002), 143--150.
[2]
Robert L. Axtell. 2001. Zipf distribution of U.S. firm sizes. Science 293, 5536 (2001), 1818--1820.
[3]
Holger Bast, Stefan Funke, Domagoj Matijevic, Peter Sanders, and Dominik Schultes. 2007. In transit to constant time shortest-path queries in road networks. In Proceedings of the Meeting on Algorithm Engineering and Experiments (ALENEX’07). 46--59.
[4]
Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker. 1999. Web caching and Zipf-like distributions: Evidence and implications. In Proceedings of the 18th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM’99). Vol. 1, 126--134.
[5]
Shimin Chen, Anastassia Ailamaki, Phillip B. Gibbons, and Todd C. Mowry. 2007. Improving hash join performance through prefetching. ACM Trans. Database Syst. 32, 3 (2007), 17.
[6]
Roman Dementiev, Lutz Kettner, Jens Mehnert, and Peter Sanders. 2004. Engineering a sorted list data structure for 32 bit keys. In Proceedings of the 6th Workshop on Algorithm Engineering and Experiments (ALENEX’04). 142--151.
[7]
Martin Dietzfelbinger, Torben Hagerup, Jyrki Katajainen, and Martti Penttonen. 1997. A reliable randomized algorithm for the closest-pair problem. J. Algor. 25, 1 (1997), 19--51.
[8]
Martin Dietzfelbinger and Christoph Weidling. 2007. Balanced allocation and dictionaries with tightly packed constant size bins. Theoret. Comput. Sci. 380, 1--2 (2007), 47--68.
[9]
Hui Gao, Jan Friso Groote, and Wim H. Hesselink. 2005. Lock-free dynamic hash tables with open addressing. Distrib. Comput. 18, 1 (2005).
[10]
Torben Hagerup and Christine Rüb. 1989. Optimal merging and sorting on the EREW-PRAM. Inform. Process. Lett. 33 (1989), 181--185.
[11]
Maurice Herlihy and Nir Shavit. 2012. The Art of Multiprocessor Programming. Elsevier.
[12]
Maurice Herlihy, Nir Shavit, and Moran Tzafrir. 2008. Hopscotch hashing. In Distributed Computing. Springer, 350--364.
[13]
Euihyeok Kim and Min-Soo Kim. 2013. Performance analysis of cache-conscious hashing techniques for multi-core CPUs. Int. J. Control Autom. 6, 2 (2013).
[14]
Donald E. Knuth. 1998. The Art of Computer Programming—Sorting and Searching (2nd ed.). Vol. 3. Addison Wesley.
[15]
Doug Lea. 2003. Hash table util. concurrent. ConcurrentHashMap, revision 1.3. JSR-166, the Proposed Java Concurrency Package. Retrieved from http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent.
[16]
Xiaozhou Li, David G. Andersen, Michael Kaminsky, and Michael J. Freedman. 2014. Algorithmic improvements for fast concurrent cuckoo hashing. In Proceedings of the 9th European Conference on Computer Systems (EuroSys’14). ACM, Article 27.
[17]
Tobias Maier. 2018. GrowT. Retrieved from https://github.com/TooBiased/growt.
[18]
Tobias Maier and Peter Sanders. 2017. Dynamic space efficient hashing. In Proceedings of the European Symposium on Algorithms (ESA’17), Vol. 87. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.
[19]
Tobias Maier, Peter Sanders, and Roman Dementiev. 2016. Concurrent hash tables: Fast and general?(!). CoRR abs/1601.04017 (2016). Retrieved from http://arxiv.org/abs/1601.04017.
[20]
Tobias Maier, Peter Sanders, and Roman Dementiev. 2016. Concurrent hash tables: Fast and general(?)!. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’16). Article 34.
[21]
Makoto Matsumoto and Takuji Nishimura. 1998. Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul. 8 (1998), 3--30. Retrieved from http://www.math.keio.ac.jp/~matumoto/emt.html.
[22]
Edward M. McCreight. 1976. A space-economical suffix tree construction algorithm. J. ACM 23, 2 (Apr. 1976), 262--272.
[23]
Paul E. McKenney and John D. Slingwine. 1998. Read-copy update: Using execution history to solve concurrency problems. Parallel Distrib. Comput. Syst. (1998), 509--518.
[24]
Kurt Mehlhorn and Peter Sanders. 2008. Algorithms and Data Structures—The Basic Toolbox. Springer.
[25]
Scott Meyers. 2005. Effective C++: 55 Specific Ways to Improve Your Programs and Designs. Pearson Education.
[26]
Ingo Müller, Peter Sanders, Arnaud Lacurie, Wolfgang Lehner, and Franz Färber. 2015. Cache-efficient aggregation: Hashing is sorting. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 1123--1136.
[27]
Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab et al. 2013. Scaling memcache at facebook. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Vol. 13. 385--398.
[28]
Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan Mcelroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, Venkateshwaran Venkataramani, and Facebook Inc. 2018. folly version 57:0. Retrieved from https://github.com/facebook/folly.
[29]
Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan Mcelroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, Venkateshwaran Venkataramani, and Facebook Inc. 2013. Scaling memcached at facebook. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI’13).
[30]
Philippe Oechslin. 2003. Making a faster cryptanalytic time-memory trade-off. In Proceedings of the 23rd Annual International Cryptology Conference on Advances in Cryptology (CRYPTO’03). Springer.
[31]
Jong Soo Park, Ming-Syan Chen, and Philip S. Yu. 1995. An effective hash-based algorithm for mining association rules. In Proceedings of the ACM SIGMOD Conference on Management of Data. 175--186.
[32]
Mathieu Desnoyers, Paul E. McKenney, and Lai Jiangshan. 2013. LWN: URCU-protected hash tables. Retrieved from http://lwn.net/Articles/573431/.
[33]
Chuck Pheatt. 2008. Intel threading building blocks. J. Comput. Sci. Coll. 23, 4 (April 2008), 298--298.
[34]
Jeff Preshing. 2016. Junction. Retrieved from https://github.com/preshing/junction.
[35]
Jeff Preshing. 2016. New Concurrent Hash Maps for C++. Retrieved from http://preshing.com/20160201/new-concurrent-hash-maps-for-cpp/.
[36]
Ori Shalev and Nir Shavit. 2006. Split-ordered lists: Lock-free extensible hash tables. J. ACM 53, 3 (May 2006), 379--405.
[37]
Julian Shun and Guy E. Blelloch. 2014. Phase-concurrent hash tables for determinism. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 96--107.
[38]
Julian Shun, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Aapo Kyrola, Harsha Vardhan Simhadri, and Kanat Tangwongsan. 2012. Brief announcement: The problem based benchmark suite. In Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 68--70.
[39]
Alex Stivala, Peter J. Stuckey, Maria Garcia de la Banda, Manuel Hermenegildo, and Anthony Wirth. 2010. Lock-free parallel dynamic programming. J. Parallel and Distrib. Comput. 70, 8 (2010).
[40]
Tony Stornetta and Forrest Brewer. 1996. Implementation of an efficient parallel BDD package. In Proceedings of the 33rd Design Automation Conference. ACM, 641--644.

Cited By

  • (2025) COREC: Concurrent non-blocking single-queue receive driver for low latency networking. Computer Networks 258 (110982). DOI: 10.1016/j.comnet.2024.110982. Online publication date: Feb-2025
  • (2024) Parallelizing Quantum Simulation With Decision Diagrams. IEEE Transactions on Quantum Engineering 5 (1-12). DOI: 10.1109/TQE.2024.3364546. Online publication date: 2024
  • (2024) End-to-End Bayesian Networks Exact Learning in Shared Memory. IEEE Transactions on Parallel and Distributed Systems 35:4 (634-645). DOI: 10.1109/TPDS.2024.3366471. Online publication date: Apr-2024

Published In

ACM Transactions on Parallel Computing, Volume 5, Issue 4
December 2018
112 pages
ISSN:2329-4949
EISSN:2329-4957
DOI:10.1145/3314574
  • Editor: David Bader
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 February 2019
Accepted: 01 December 2018
Revised: 01 April 2018
Received: 01 September 2016
Published in TOPC Volume 5, Issue 4

Author Tags

  1. Concurrency
  2. dynamic data structures
  3. experimental analysis
  4. hash table
  5. lock-freedom
  6. transactional memory

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (Last 12 months): 167
  • Downloads (Last 6 weeks): 22
Reflects downloads up to 22 Jan 2025

Cited By

  • (2024) OxiDD. Tools and Algorithms for the Construction and Analysis of Systems (255-275). DOI: 10.1007/978-3-031-57256-2_13. Online publication date: 6-Apr-2024
  • (2023) IcebergHT: High Performance Hash Tables Through Stability and Low Associativity. Proceedings of the ACM on Management of Data 1:1 (1-26). DOI: 10.1145/3588727. Online publication date: 30-May-2023
  • (2023) TurboHash: A Hash Table for Key-value Store on Persistent Memory. Proceedings of the 16th ACM International Conference on Systems and Storage (35-48). DOI: 10.1145/3579370.3594766. Online publication date: 5-Jun-2023
  • (2023) Building a Compiled Query Engine in Python. Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction (180-190). DOI: 10.1145/3578360.3580264. Online publication date: 17-Feb-2023
  • (2023) Parallel global edge switching for the uniform sampling of simple graphs with prescribed degrees. Journal of Parallel and Distributed Computing 174 (118-129). DOI: 10.1016/j.jpdc.2022.12.010. Online publication date: Apr-2023
  • (2023) Distributed Deep Multilevel Graph Partitioning. Euro-Par 2023: Parallel Processing (443-457). DOI: 10.1007/978-3-031-39698-4_30. Online publication date: 24-Aug-2023
  • (2022) Limited Associativity Makes Concurrent Software Caches a Breeze. Proceedings of the 23rd International Conference on Distributed Computing and Networking (87-96). DOI: 10.1145/3491003.3491013. Online publication date: 4-Jan-2022
