Abstract:
There has been a long history of finding a spaceefficient data structure to support approximate membership queries, started from Bloom's work in the 1970's. Given a set A...Show MoreMetadata
Abstract:
There has been a long history of finding a spaceefficient data structure to support approximate membership queries, started from Bloom's work in the 1970's. Given a set A of n items and an additional item x from the same universe U of a size m ≫ n, we want to distinguish whether x ∈ A or not, using small (limited) space. The solutions for the membership query are needed for many network applications, such as cache directory, load-balancing, security, etc. If A is static, there exist optimal algorithms to find a randomized data structure to represent A using only (1+ o(1))n log 1/δ bits, which only allows for a small false positive δ but no false negative. However, existing optimal algorithms are not practical for many Internet applications, e.g., social network services, peer-to-peer systems, network traffic monitoring, etc. They are too spaceand time-expensive due to the frequent changes in the set A, because all items are needed to recompute the optimal data structure for each change using a linear running time. In this paper, we propose a novel data structure to support the approximate membership query in the time-decaying window model. In this model, items are inserted one-by-one over a data stream, and we want to determine whether an item is among the most recent w items for any given window size w ≤ n. Our data structure only requires O(n(log 1/δ+logn)) bits and O(1) running time. We also prove a non-trivial space lower bound, i.e. (n - δm) log(n - δm) bits, which guarantees that our data structure is near-optimal. Our data structure has been evaluated using both synthetic and real data sets.
Published in: 2013 Proceedings IEEE INFOCOM
Date of Conference: 14-19 April 2013
Date Added to IEEE Xplore: 25 July 2013
ISBN Information: