Abstract
Filters (such as Bloom Filters) are fundamental data structures that speed up network routing and measurement operations by storing a compressed representation of a set. Filters are very space efficient, but can make bounded one-sided errors: with tunable probability \(\epsilon \), they may report that a query element is stored in the filter when it is not. This is called a false positive. Recent research has focused on methods for dynamically adapting filters to false positives, thereby reducing the number of false positives when some elements are queried repeatedly.
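To make the notions of a one-sided error and of adapting to a false positive concrete, here is a minimal Python sketch. It is a toy fingerprint filter, not the Adaptive Cuckoo Filter designed in this paper: the class `ToyAdaptiveFilter`, its methods, the per-bucket selector, and the retry bound `max_tries` are all assumptions made for illustration. The sketch also assumes the full elements are available in a backing store, which is what allows a bucket to be re-fingerprinted after a false positive is detected.

```python
import hashlib


def _hash(item: str, salt: str, modulus: int) -> int:
    """Deterministic hash of `item` under `salt`, reduced to range(modulus)."""
    digest = hashlib.sha256(f"{salt}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


class ToyAdaptiveFilter:
    """Toy fingerprint filter with a per-bucket hash selector (illustration only)."""

    def __init__(self, num_buckets: int, fingerprint_bits: int):
        self.num_buckets = num_buckets
        self.fp_range = 1 << fingerprint_bits
        # Full elements; stands in for the slower dictionary the filter fronts.
        self.backing = [[] for _ in range(num_buckets)]
        # Which fingerprint hash each bucket currently uses.
        self.selector = [0] * num_buckets
        # The compact part: only short fingerprints are kept here.
        self.fingerprints = [set() for _ in range(num_buckets)]

    def _bucket(self, item: str) -> int:
        return _hash(item, "bucket", self.num_buckets)

    def _fingerprint(self, item: str, bucket: int) -> int:
        return _hash(item, f"fp{self.selector[bucket]}", self.fp_range)

    def insert(self, item: str) -> None:
        b = self._bucket(item)
        self.backing[b].append(item)
        self.fingerprints[b].add(self._fingerprint(item, b))

    def query(self, item: str) -> bool:
        """True means 'possibly stored'; False means 'definitely not stored'."""
        b = self._bucket(item)
        return self._fingerprint(item, b) in self.fingerprints[b]

    def adapt(self, item: str, max_tries: int = 16) -> None:
        """Fix a reported false positive by re-fingerprinting its bucket."""
        b = self._bucket(item)
        if item in self.backing[b]:
            return  # a true positive; nothing to fix
        for _ in range(max_tries):
            if not self.query(item):
                return
            # Switch this bucket to its next fingerprint hash and rebuild it
            # from the full elements held in the backing store.
            self.selector[b] += 1
            self.fingerprints[b] = {
                self._fingerprint(x, b) for x in self.backing[b]
            }
```

In this sketch a stored element always answers `True`, so errors are one-sided. A non-member answers `True` only when its fingerprint collides with one stored in its bucket, and calling `adapt` on such a query makes later repeats of that query unlikely to collide again.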
Ideally, an adaptive filter would incur a false positive with bounded probability \(\epsilon \) for each new query element, and would incur \(o(\epsilon )\) total false positives over all repeated queries to that element. We call such a filter support optimal.
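One informal way to read this definition is sketched below. The notation is an assumption of this sketch and is not taken from the paper: \(S\) is the stored set, \(k\) is the number of distinct queried elements outside \(S\), and \(\mathrm{FP}(x)\) is the total number of false positives incurred over all queries to \(x\).

```latex
\[
  \mathbb{E}[\mathrm{FP}(x)]
  \;\le\; \underbrace{\epsilon}_{\text{first query to } x}
  \;+\; \underbrace{o(\epsilon)}_{\text{all repeated queries to } x}
  \qquad \text{for each queried } x \notin S,
\]
\[
  \text{so, by linearity of expectation,}\qquad
  \mathbb{E}\Bigl[\,\sum_{x \notin S} \mathrm{FP}(x)\Bigr]
  \;\le\; (1 + o(1))\,\epsilon k .
\]
```

Under this reading, the expected number of false positives depends on how many distinct non-member elements are queried, not on how often each one is repeated.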
In this paper we design a new Adaptive Cuckoo Filter, and show that it is support optimal (up to additive logarithmic terms) over any n queries when storing a set of size n.
We complement these bounds with experiments that show that our data structure is effective at fixing false positives on network trace datasets, outperforming previous Adaptive Cuckoo Filters.
Finally, we investigate adversarial adaptivity, a stronger notion of adaptivity in which an adaptive adversary repeatedly queries the filter, using the result of previous queries to drive the false positive rate as high as possible. We prove a lower bound showing that a broad family of filters, including all known Adaptive Cuckoo Filters, can be forced by such an adversary to incur a large number of false positives.
This work was supported in part by ISF grants no. 1278/16 and 1926/19, by BSF grant no. 2018364, and by ERC grant MPM under the EU’s Horizon 2020 Research and Innovation Programme (grant no. 683064).
Notes
1. Note that the filter does not have access to this sequence ahead of time; it must process the queries online.
2. We assume \(\gamma n\) is an integer multiple of \(bk\) for simplicity.
3. When treating the hash value as a bit string we assume that \(N\) is a power of two for simplicity; this assumption is not necessary for the implementation.
4.
5. That is to say, \(\gamma = 2\).
6. The Cyclic ACF does not quite satisfy Definition 1 since its hashes are not independent. However, this only makes it easier for an adversary to find false positives.
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Kopelowitz, T., McCauley, S., Porat, E. (2021). Support Optimality and Adaptive Cuckoo Filters. In: Lubiw, A., Salavatipour, M., He, M. (eds) Algorithms and Data Structures. WADS 2021. Lecture Notes in Computer Science, vol 12808. Springer, Cham. https://doi.org/10.1007/978-3-030-83508-8_40
DOI: https://doi.org/10.1007/978-3-030-83508-8_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-83507-1
Online ISBN: 978-3-030-83508-8
eBook Packages: Computer Science, Computer Science (R0)