Dynamically Maintaining Duplicate-Insensitive and Time-Decayed Sum Using Time-Decaying Bloom Filter

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5574)

Abstract

The duplicate-insensitive and time-decayed sum of an arbitrary subset of a stream is an important aggregate for analyses in many distributed stream scenarios. In general, computing this sum exactly over an unbounded, high-rate stream is infeasible. We therefore address this problem by introducing a sketch, the time-decaying Bloom filter (TDBF). The TDBF detects duplicates in a stream while dynamically maintaining the decayed weight of every distinct element according to a user-specified decay function. For a query on the current decayed sum of a subset of the stream, the TDBF provides an effective estimate. Our theoretical analysis gives a provable approximation guarantee on the estimation error, and experimental results on synthetic stream data validate the analysis.
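
The paper's actual construction is not part of this preview, so the following is only a minimal illustrative sketch of the general idea the abstract describes, not the authors' TDBF. It assumes an exponential decay function, a fixed array of cells addressed by k hash functions, and a "first arrival wins" rule for duplicates. The class name `DecayingBloomSketch`, its parameters `m`, `k`, and `lam`, and all method names are hypothetical.

```python
import hashlib
import math


class DecayingBloomSketch:
    """Illustrative only: a Bloom-style array whose cells hold decayable weights.

    Assumptions (not taken from the paper): exponential decay
    w * exp(-lam * (t - t0)), strictly positive weights, and duplicates
    ignored after the first arrival.
    """

    def __init__(self, m, k, lam):
        self.m = m            # number of cells
        self.k = k            # number of hash functions
        self.lam = lam        # decay rate of the assumed exponential decay
        # Each cell stores log(w * exp(lam * t0)), or None if never written.
        self.cells = [None] * m

    def _indexes(self, item):
        # Derive k cell indexes by salting SHA-256 with the hash number.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item, weight, t):
        """Record item with the given positive weight at time t.

        Returns False when all k cells are already set, i.e. the item is
        treated as a duplicate (this can be a false positive).
        """
        idx = list(self._indexes(item))
        if all(self.cells[j] is not None for j in idx):
            return False
        stamp = math.log(weight) + self.lam * t
        for j in idx:
            if self.cells[j] is None:
                self.cells[j] = stamp
        return True

    def decayed_weight(self, item, t):
        """Estimate the item's decayed weight at time t (0.0 if unseen)."""
        idx = list(self._indexes(item))
        if any(self.cells[j] is None for j in idx):
            return 0.0
        # Take the smallest stored stamp, then apply the decay up to time t.
        return math.exp(min(self.cells[j] for j in idx) - self.lam * t)

    def decayed_sum(self, items, t):
        """Estimate the decayed sum over the distinct items of a subset."""
        return sum(self.decayed_weight(x, t) for x in set(items))
```

Storing the logarithm of the time-shifted weight means a cell never has to be rewritten as time passes; the decay is applied only when a query arrives. Hash collisions can bias the estimate in either direction in this toy version, which is exactly the kind of error the paper's analysis is said to bound for the real TDBF.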

This work is supported by Chinese Academy of Science "100 Talents" Project and National Science Foundation of China under its General Projects funding #60772034. Corresponding author: H. Shen.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Y., Shen, H., Tian, H., Zhang, X. (2009). Dynamically Maintaining Duplicate-Insensitive and Time-Decayed Sum Using Time-Decaying Bloom Filter. In: Hua, A., Chang, SL. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2009. Lecture Notes in Computer Science, vol 5574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03095-6_70

  • DOI: https://doi.org/10.1007/978-3-642-03095-6_70

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03094-9

  • Online ISBN: 978-3-642-03095-6

  • eBook Packages: Computer Science, Computer Science (R0)
