Abstract
Recently, much work has focused on devising one-pass algorithms for processing and querying multiple data streams in applications such as network monitoring and sensor networks. Estimating the cardinality of set expressions over streams is perhaps one of the most fundamental problems. Unfortunately, no solution has been devised for this problem over sliding windows. In this paper, we propose a space-efficient algorithmic solution for estimating the cardinality of set expressions over sliding windows. Our probabilistic method is based on a new hash-based synopsis, termed the improved 2-level hash sketch. A thorough experimental evaluation demonstrates that our methods solve the problem efficiently.
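To make the problem setting concrete, the following is a minimal sketch of estimating union and intersection cardinalities of two streams over a time-based sliding window. It is not the improved 2-level hash sketch of this paper, whose details are not given in the abstract. It substitutes a simpler, well-known technique: min-wise hashing, with a monotonic deque per hash function to track the window minimum. All names (`WindowedMinHash`, `jaccard`, `union_size`, `intersection_size`) are hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
import hashlib
from collections import deque


def _hash01(item, seed):
    """Map (item, seed) to a deterministic pseudo-random float in [0, 1)."""
    digest = hashlib.blake2b(repr(item).encode(), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    # Keep the top 53 bits so the result is an exact float below 1.0.
    return (int.from_bytes(digest, "little") >> 11) / 2.0**53


class WindowedMinHash:
    """Per-hash-function minimum over the window (now - window, now]."""

    def __init__(self, window, num_hashes=256):
        self.window = window
        self.queues = [deque() for _ in range(num_hashes)]

    def add(self, item, t):
        # Assumption: t is non-decreasing across calls.
        for seed, q in enumerate(self.queues):
            h = _hash01(item, seed)
            # Older entries with a hash >= h can never be the minimum again.
            while q and q[-1][1] >= h:
                q.pop()
            q.append((t, h))

    def minima(self, now):
        out = []
        for q in self.queues:
            # Drop entries that have fallen out of the window.
            while q and q[0][0] <= now - self.window:
                q.popleft()
            out.append(q[0][1] if q else None)
        return out


def jaccard(a, b, now):
    """Fraction of hash functions whose window minima agree."""
    pairs = [(x, y) for x, y in zip(a.minima(now), b.minima(now))
             if x is not None or y is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def union_size(a, b, now):
    """Moment-based estimate: the minimum of n uniform values has mean 1/(n+1)."""
    mins = []
    for x, y in zip(a.minima(now), b.minima(now)):
        vals = [v for v in (x, y) if v is not None]
        if not vals:
            return 0.0  # both windows are empty
        mins.append(min(vals))
    return len(mins) / sum(mins) - 1.0


def intersection_size(a, b, now):
    """|A ∩ B| is approximated as Jaccard(A, B) * |A ∪ B|."""
    return jaccard(a, b, now) * union_size(a, b, now)
```

A usage example: feed each stream with `add(item, timestamp)` and query at the current time, for instance `intersection_size(a, b, now=99)`. The deque trick keeps only entries that could still become the window minimum, so memory per hash function stays small in expectation, but this sketch makes no claim to match the space bounds or accuracy reported in the paper.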
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jin, C., Zhou, A. (2005). Distinct Estimate of Set Expressions over Sliding Windows. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_52
DOI: https://doi.org/10.1007/978-3-540-31849-1_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25207-8
Online ISBN: 978-3-540-31849-1
eBook Packages: Computer Science, Computer Science (R0)