Abstract
We present two novel algorithms for tracking the number of distinct items over high-speed data streams consisting of insertion and deletion operations. Both algorithms improve on the space and time complexity of existing algorithms.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ganguly, S. (2005). Counting Distinct Items over Update Streams. In: Deng, X., Du, DZ. (eds) Algorithms and Computation. ISAAC 2005. Lecture Notes in Computer Science, vol 3827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602613_51
DOI: https://doi.org/10.1007/11602613_51
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30935-2
Online ISBN: 978-3-540-32426-3
eBook Packages: Computer Science, Computer Science (R0)