ABSTRACT
Hash tables provide efficient table implementations, achieving O(1) query, insert, and delete operations at low loads. However, at moderate or high loads, collisions are frequent and performance degrades. In this paper, we propose the segmented hash table architecture, which ensures constant-time hash operations at high loads with high probability. To achieve this, the hash memory is divided into N logical segments so that each incoming key has N potential storage locations; the destination segment is chosen so as to minimize collisions. In this way, collisions, and the associated probe sequences, are dramatically reduced. To keep memory utilization to a minimum, probabilistic filters are kept on-chip so that the N segments can be accessed without increasing the number of off-chip memory operations. These filters are kept small and accurate with the help of a novel algorithm, called selective filter insertion, which keeps the segments balanced while minimizing false positive rates (i.e., incorrect filter predictions). The performance of our scheme is quantified via analytical modeling and software simulations. Moreover, we discuss efficient implementations that are easily realizable in modern device technologies. The performance benefits are significant: average search cost is reduced by 40% or more, while the likelihood of requiring more than one memory operation per search is reduced by several orders of magnitude.
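The segment-selection idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes chained buckets, a single hash function shared by all segments, and a plain Bloom filter per segment as a stand-in for the on-chip probabilistic filters. It does not implement selective filter insertion. All names (`SegmentedHashTable`, `BloomFilter`, `memory_probes`) and all parameter defaults are hypothetical.

```python
import hashlib


def _hash(key: str, salt: str, modulus: int) -> int:
    """Deterministic salted hash of a string key, reduced to [0, modulus)."""
    digest = hashlib.sha256(f"{salt}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class BloomFilter:
    """Plain Bloom filter, assumed here as the per-segment on-chip filter."""

    def __init__(self, num_bits: int, num_hashes: int, name: str):
        self.bits = [False] * num_bits
        self.num_hashes = num_hashes
        self.name = name

    def _positions(self, key: str):
        return [_hash(key, f"{self.name}/{i}", len(self.bits))
                for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


class SegmentedHashTable:
    """Hash memory split into N logical segments (illustrative only)."""

    def __init__(self, num_segments=4, buckets_per_segment=64,
                 filter_bits=1024, filter_hashes=3):
        self.num_segments = num_segments
        self.buckets_per_segment = buckets_per_segment
        # segments[s][b] is the chain of [key, value] entries in bucket b.
        self.segments = [[[] for _ in range(buckets_per_segment)]
                         for _ in range(num_segments)]
        self.filters = [BloomFilter(filter_bits, filter_hashes, f"filter{s}")
                        for s in range(num_segments)]
        self.memory_probes = 0  # counts simulated off-chip bucket reads

    def _bucket(self, key: str) -> int:
        # Assumption: the same bucket index is used in every segment.
        return _hash(key, "bucket", self.buckets_per_segment)

    def insert(self, key: str, value) -> int:
        """Store the pair and return the index of the segment used."""
        b = self._bucket(key)
        # Update in place if the key is already stored somewhere.
        for s in range(self.num_segments):
            if self.filters[s].might_contain(key):
                for entry in self.segments[s][b]:
                    if entry[0] == key:
                        entry[1] = value
                        return s
        # Otherwise pick the segment whose candidate bucket has the
        # shortest chain; ties go to the lowest-numbered segment.
        target = min(range(self.num_segments),
                     key=lambda s: len(self.segments[s][b]))
        self.segments[target][b].append([key, value])
        self.filters[target].add(key)
        return target

    def lookup(self, key: str):
        """Return the stored value, or None if the key is absent."""
        b = self._bucket(key)
        for s in range(self.num_segments):
            if not self.filters[s].might_contain(key):
                continue  # filter rules this segment out: no memory access
            self.memory_probes += 1
            for k, v in self.segments[s][b]:
                if k == key:
                    return v
        return None
```

In this sketch a lookup reads a bucket only in segments whose filter reports a possible match. When the filters are accurate, a search for a stored key therefore costs about one simulated memory probe regardless of N.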
Index Terms
- Segmented hash: an efficient hash table implementation for high performance networking subsystems