Concurrent Hash Tables: Fast and General(?)!

Authors:

Roman Dementiev

ACM Transactions on Parallel Computing (TOPC), Volume 5, Issue 4

Article No.: 16, Pages 1 - 32

https://doi.org/10.1145/3309206

Published: 22 February 2019

Abstract

Concurrent hash tables are among the most important concurrent data structures and are used in numerous applications. In some applications, hash table accesses dominate the execution time. To solve these problems efficiently in parallel, we need implementations that achieve speedups in highly concurrent scenarios. Unfortunately, currently available concurrent hashing libraries fall far short of this requirement, particularly when adaptively sized tables are necessary or when some elements are contended.

Our starting point for better-performing data structures is a fast and simple lock-free concurrent hash table based on linear probing; it is, however, limited to word-sized key-value types and does not support dynamic size adaptation. We explain how to lift these limitations in a provably scalable way and demonstrate that dynamic growing has a performance overhead comparable to that of the same generalization in sequential hash tables.
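
To make the starting point concrete, the following is a minimal sketch of a fixed-size, lock-free linear probing table. It is not the implementation from the article or from GrowT: the class name, the hash function, and the choice to pack a 32-bit key and a 32-bit value into one 64-bit cell (so that a single compare-and-swap publishes both) are assumptions made for illustration. Key 0 is reserved as the empty marker, the capacity must be a power of two, and neither deletion nor growing is supported.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative sketch only; all names are hypothetical and this is not
// the GrowT implementation.
class FixedLinearProbingTable {
public:
    // capacity must be a power of two; the table never grows.
    explicit FixedLinearProbingTable(std::size_t capacity)
        : mask_(capacity - 1), cells_(capacity) {
        for (auto& cell : cells_) cell.store(kEmpty, std::memory_order_relaxed);
    }

    // Inserts (key, value) if key is absent. key must be nonzero because
    // key 0 marks an empty cell. Returns false if the key already exists
    // or every cell is occupied.
    bool insert(std::uint32_t key, std::uint32_t value) {
        const std::uint64_t desired = pack(key, value);
        std::size_t i = home(key);
        for (std::size_t probes = 0; probes <= mask_; ++probes) {
            std::uint64_t seen = cells_[i].load(std::memory_order_acquire);
            if (seen == kEmpty &&
                cells_[i].compare_exchange_strong(seen, desired,
                                                  std::memory_order_acq_rel,
                                                  std::memory_order_acquire)) {
                return true;  // this thread claimed the empty cell
            }
            // seen now holds an occupied cell, possibly one written by a
            // competing thread that won the compare-and-swap.
            if (key_of(seen) == key) return false;
            i = (i + 1) & mask_;  // linear probing: move to the next cell
        }
        return false;  // no free cell found
    }

    // Looks up key; on success stores the value in out and returns true.
    bool find(std::uint32_t key, std::uint32_t& out) const {
        std::size_t i = home(key);
        for (std::size_t probes = 0; probes <= mask_; ++probes) {
            const std::uint64_t seen = cells_[i].load(std::memory_order_acquire);
            if (seen == kEmpty) return false;  // probe sequence ends here
            if (key_of(seen) == key) {
                out = static_cast<std::uint32_t>(seen);  // low 32 bits
                return true;
            }
            i = (i + 1) & mask_;
        }
        return false;
    }

private:
    static constexpr std::uint64_t kEmpty = 0;

    static std::uint64_t pack(std::uint32_t key, std::uint32_t value) {
        return (static_cast<std::uint64_t>(key) << 32) | value;
    }
    static std::uint32_t key_of(std::uint64_t cell) {
        return static_cast<std::uint32_t>(cell >> 32);
    }
    // Multiplicative hash, chosen here only for brevity.
    std::size_t home(std::uint32_t key) const {
        const std::uint64_t h = std::uint64_t{key} * 0x9E3779B97F4A7C15ull;
        return static_cast<std::size_t>(h >> 32) & mask_;
    }

    std::size_t mask_;
    std::vector<std::atomic<std::uint64_t>> cells_;
};
```

Because a cell only ever changes from empty to occupied, a failed compare-and-swap tells the inserting thread that another thread filled the cell first, and it can continue probing without locks. The fixed capacity and the word-sized cell are exactly the restrictions that the abstract says the article lifts.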

We perform extensive experiments comparing the performance of our implementations with six of the most widely used concurrent hash tables. Ours are considerably faster than the best algorithms with similar restrictions and an order of magnitude faster than the best more general tables. In some extreme cases, the difference even approaches four orders of magnitude.

All implementations discussed in this publication are available on GitHub [17].

References

[1]

Lada A. Adamic and Bernardo A. Huberman. 2002. Zipf’s law and the Internet. Glottometrics 3, 1 (2002), 143--150.

[2]

Robert L. Axtell. 2001. Zipf distribution of U.S. firm sizes. Science 293, 5536 (2001), 1818--1820.

[3]

Holger Bast, Stefan Funke, Domagoj Matijevic, Peter Sanders, and Dominik Schultes. 2007. In transit to constant time shortest-path queries in road networks. In Proceedings of the Meeting on Algorithm Engineering and Experiments (ALENEX’07). 46--59.

[4]

Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker. 1999. Web caching and Zipf-like distributions: Evidence and implications. In Proceedings of the 18th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM’99). Vol. 1, 126--134.

[5]

Shimin Chen, Anastassia Ailamaki, Phillip B. Gibbons, and Todd C. Mowry. 2007. Improving hash join performance through prefetching. ACM Trans. Database Syst. 32, 3 (2007), 17.

[6]

Roman Dementiev, Lutz Kettner, Jens Mehnert, and Peter Sanders. 2004. Engineering a sorted list data structure for 32 bit keys. In Proceedings of the 6th Workshop on Algorithm Engineering and Experiments (ALENEX’04). 142--151.

[7]

Martin Dietzfelbinger, Torben Hagerup, Jyrki Katajainen, and Martti Penttonen. 1997. A reliable randomized algorithm for the closest-pair problem. J. Algor. 25, 1 (1997), 19--51.

[8]

Martin Dietzfelbinger and Christoph Weidling. 2007. Balanced allocation and dictionaries with tightly packed constant size bins. Theoret. Comput. Sci. 380, 1--2 (2007), 47--68.

[9]

Hui Gao, Jan Friso Groote, and Wim H. Hesselink. 2005. Lock-free dynamic hash tables with open addressing. Distrib. Comput. 18, 1 (2005).

[10]

Torben Hagerup and Christine Rüb. 1989. Optimal merging and sorting on the EREW-PRAM. Inform. Process. Lett. 33 (1989), 181--185.

[11]

Maurice Herlihy and Nir Shavit. 2012. The Art of Multiprocessor Programming. Elsevier.

[12]

Maurice Herlihy, Nir Shavit, and Moran Tzafrir. 2008. Hopscotch hashing. In Distributed Computing. Springer, 350--364.

[13]

Euihyeok Kim and Min-Soo Kim. 2013. Performance analysis of cache-conscious hashing techniques for multi-core CPUs. Int. J. Control Autom. 6, 2 (2013).

[14]

Donald E. Knuth. 1998. The Art of Computer Programming—Sorting and Searching (2nd ed.). Vol. 3. Addison Wesley.

[15]

Doug Lea. 2003. Hash table util. concurrent. ConcurrentHashMap, revision 1.3. JSR-166, the Proposed Java Concurrency Package. Retrieved from http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent.

[16]

Xiaozhou Li, David G. Andersen, Michael Kaminsky, and Michael J. Freedman. 2014. Algorithmic improvements for fast concurrent cuckoo hashing. In Proceedings of the 9th European Conference on Computer Systems (EuroSys’14). ACM, Article 27.

[17]

Tobias Maier. 2018. GrowT. Retrieved from https://github.com/TooBiased/growt.

[18]

Tobias Maier and Peter Sanders. 2017. Dynamic space efficient hashing. In Proceedings of the European Symposium on Algorithms (ESA’17), Vol. 87. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.

[19]

Tobias Maier, Peter Sanders, and Roman Dementiev. 2016. Concurrent hash tables: Fast and general?(!). CoRR abs/1601.04017 (2016). Retrieved from http://arxiv.org/abs/1601.04017.

[20]

Tobias Maier, Peter Sanders, and Roman Dementiev. 2016. Concurrent hash tables: Fast and general(?)!. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’16). Article 34.

[21]

Makoto Matsumoto and Takuji Nishimura. 1998. Mersenne Twister: A 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans. Model. Comput. Simul. 8 (1998), 3--30. Retrieved from http://www.math.keio.ac.jp/~matumoto/emt.html.

[22]

Edward M. McCreight. 1976. A space-economical suffix tree construction algorithm. J. ACM 23, 2 (Apr. 1976), 262--272.

[23]

Paul E. McKenney and John D. Slingwine. 1998. Read-copy update: Using execution history to solve concurrency problems. Parallel Distrib. Comput. Syst. (1998), 509--518.

[24]

Kurt Mehlhorn and Peter Sanders. 2008. Algorithms and Data Structures—The Basic Toolbox. Springer.

[25]

Scott Meyers. 2005. Effective C++: 55 Specific Ways to Improve Your Programs and Designs. Pearson Education.

[26]

Ingo Müller, Peter Sanders, Arnaud Lacurie, Wolfgang Lehner, and Franz Färber. 2015. Cache-efficient aggregation: Hashing is sorting. In Proceedings of the ACM SIGMOD International Conference on Management of Data. 1123--1136.

[27]

Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan McElroy, Mike Paleczny, Daniel Peek, Paul Saab et al. 2013. Scaling memcache at facebook. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Vol. 13. 385--398.

[28]

Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan Mcelroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, Venkateshwaran Venkataramani, and Facebook Inc. 2018. folly version 57:0. Retrieved from https://github.com/facebook/folly.

[29]

Rajesh Nishtala, Hans Fugal, Steven Grimm, Marc Kwiatkowski, Herman Lee, Harry C. Li, Ryan Mcelroy, Mike Paleczny, Daniel Peek, Paul Saab, David Stafford, Tony Tung, Venkateshwaran Venkataramani, and Facebook Inc. 2013. Scaling memcached at facebook. In Proceedings of the USENIX Symposium on Networked Systems Design and Implementation (NSDI’13).

[30]

Philippe Oechslin. 2003. Making a faster cryptanalytic time-memory trade-off. In Proceedings of the 23rd Annual International Cryptology Conference on Advances in Cryptology (CRYPTO’03). Springer.

[31]

Jong Soo Park, Ming-Syan Chen, and Philip S. Yu. 1995. An effective hash-based algorithm for mining association rules. In Proceedings of the ACM SIGMOD Conference on Management of Data. 175--186.

[32]

Mathieu Desnoyers, Paul E. McKenney, and Lai Jiangshan. 2013. LWN: URCU-protected hash tables. Retrieved from http://lwn.net/Articles/573431/.

[33]

Chuck Pheatt. 2008. Intel® threading building blocks. J. Comput. Sci. Coll. 23, 4 (April 2008), 298--298.

[34]

Jeff Preshing. 2016. Junction. Retrieved from https://github.com/preshing/junction.

[35]

Jeff Preshing. 2016. New Concurrent Hash Maps for C++. Retrieved from http://preshing.com/20160201/new-concurrent-hash-maps-for-cpp/.

[36]

Ori Shalev and Nir Shavit. 2006. Split-ordered lists: Lock-free extensible hash tables. J. ACM 53, 3 (May 2006), 379--405.

[37]

Julian Shun and Guy E. Blelloch. 2014. Phase-concurrent hash tables for determinism. In Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 96--107.

[38]

Julian Shun, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Aapo Kyrola, Harsha Vardhan Simhadri, and Kanat Tangwongsan. 2012. Brief announcement: The problem based benchmark suite. In Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA). 68--70.

[39]

Alex Stivala, Peter J. Stuckey, Maria Garcia de la Banda, Manuel Hermenegildo, and Anthony Wirth. 2010. Lock-free parallel dynamic programming. J. Parallel and Distrib. Comput. 70, 8 (2010).

[40]

Tony Stornetta and Forrest Brewer. 1996. Implementation of an efficient parallel BDD package. In Proceedings of the 33rd Design Automation Conference. ACM, 641--644.

Published In

ACM Transactions on Parallel Computing Volume 5, Issue 4

December 2018

112 pages

ISSN: 2329-4949

EISSN: 2329-4957

DOI: 10.1145/3314574

Editor:
David Bader
Georgia Institute of Technology, USA

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 February 2019

Accepted: 01 December 2018

Revised: 01 April 2018

Received: 01 September 2016

Published in TOPC Volume 5, Issue 4

Qualifiers

Research-article
Research
Refereed

Cited By

Faltelli M, Belocchi G, Quaglia F, Bianchi G. 2025. COREC: Concurrent non-blocking single-queue receive driver for low latency networking. Computer Networks 258, 110982. Online publication date: Feb-2025.
https://doi.org/10.1016/j.comnet.2024.110982
Li S, Kimura Y, Sato H, Fujita M. 2024. Parallelizing Quantum Simulation With Decision Diagrams. IEEE Transactions on Quantum Engineering 5, 1-12. Online publication date: 2024.
https://doi.org/10.1109/TQE.2024.3364546
Karan S, Sayed Z, Zola J. 2024. End-to-End Bayesian Networks Exact Learning in Shared Memory. IEEE Transactions on Parallel and Distributed Systems 35:4, 634-645. Online publication date: Apr-2024.
https://doi.org/10.1109/TPDS.2024.3366471
Husung N, Dubslaff C, Hermanns H, Köhl M. 2024. OxiDD. Tools and Algorithms for the Construction and Analysis of Systems, 255-275. Online publication date: 6-Apr-2024.
https://dl.acm.org/doi/10.1007/978-3-031-57256-2_13
Pandey P, Bender M, Conway A, Farach-Colton M, Kuszmaul W, Tagliavini G, Johnson R. 2023. IcebergHT: High Performance Hash Tables Through Stability and Low Associativity. Proceedings of the ACM on Management of Data 1:1, 1-26. Online publication date: 30-May-2023.
https://dl.acm.org/doi/10.1145/3588727
Zhao X, Zhong C, Jiang S, Gilad Y, Kostic D, Moatti Y, Biran O. 2023. TurboHash: A Hash Table for Key-value Store on Persistent Memory. Proceedings of the 16th ACM International Conference on Systems and Storage, 35-48. Online publication date: 5-Jun-2023.
https://dl.acm.org/doi/10.1145/3579370.3594766
Shahrokhi H, Shaikhha A, Verbrugge C, Lhoták O, Shen X. 2023. Building a Compiled Query Engine in Python. Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction, 180-190. Online publication date: 17-Feb-2023.
https://dl.acm.org/doi/10.1145/3578360.3580264
Allendorf D, Meyer U, Penschuck M, Tran H. 2023. Parallel global edge switching for the uniform sampling of simple graphs with prescribed degrees. Journal of Parallel and Distributed Computing 174, 118-129. Online publication date: Apr-2023.
https://doi.org/10.1016/j.jpdc.2022.12.010
Sanders P, Seemaier D. 2023. Distributed Deep Multilevel Graph Partitioning. Euro-Par 2023: Parallel Processing, 443-457. Online publication date: 24-Aug-2023.
https://doi.org/10.1007/978-3-031-39698-4_30
Adas D, Einziger G, Friedman R. 2022. Limited Associativity Makes Concurrent Software Caches a Breeze. Proceedings of the 23rd International Conference on Distributed Computing and Networking, 87-96. Online publication date: 4-Jan-2022.
https://dl.acm.org/doi/10.1145/3491003.3491013
