A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window

  • Conference paper
STACS 2007

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4393)

Abstract

We consider the problem of maintaining aggregates over recent elements of a massive data stream. Motivated by applications involving network data, we consider asynchronous data streams, where the observed order of data may differ from the order in which the data was generated. The set of recent elements is modeled as a sliding timestamp window of the stream, whose elements change continuously with time. We present the first deterministic algorithms for maintaining a small-space summary of elements in a sliding timestamp window of an asynchronous data stream. The summary can return approximate answers for the following fundamental aggregates: basic count, the number of elements within the sliding window, and sum, the sum of all element values within the sliding window. For basic counting, the space taken by our summary is O(log W · log B · (log W + log B)/ε) bits, where B is an upper bound on the value of the basic count, W is an upper bound on the width of the timestamp window, and ε is the desired relative error. Our algorithms are based on a novel data structure called the splittable histogram. Prior to this work, only randomized algorithms were known for this problem, and they provide weaker guarantees than our deterministic algorithms.
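
To make the two aggregates concrete, the following is a minimal sketch of an exact, linear-space baseline. It is not the splittable histogram described in the paper, and it offers none of its space savings; it only illustrates what basic count and sum mean when elements arrive out of timestamp order. The class name ExactAsyncWindow, its method names, and the convention that a window of width w queried at time c covers timestamps in [c - w, c] are assumptions made for illustration.

```python
import bisect


class ExactAsyncWindow:
    """Exact baseline for sliding timestamp windows (hypothetical helper).

    Stores every element, so space is linear in the stream length,
    unlike the small-space summary the abstract describes.
    """

    def __init__(self, max_width):
        self.max_width = max_width  # W: upper bound on the window width
        self.timestamps = []        # kept sorted by timestamp
        self.values = []            # values aligned with self.timestamps

    def insert(self, timestamp, value=1):
        # Asynchronous stream: arrival order may differ from timestamp
        # order, so each element is placed by timestamp, not by arrival.
        i = bisect.bisect_right(self.timestamps, timestamp)
        self.timestamps.insert(i, timestamp)
        self.values.insert(i, value)

    def _range(self, now, width):
        if width > self.max_width:
            raise ValueError("window width exceeds the bound W")
        # Assumed window convention: timestamps in [now - width, now].
        lo = bisect.bisect_left(self.timestamps, now - width)
        hi = bisect.bisect_right(self.timestamps, now)
        return lo, hi

    def basic_count(self, now, width):
        lo, hi = self._range(now, width)
        return hi - lo

    def window_sum(self, now, width):
        lo, hi = self._range(now, width)
        return sum(self.values[lo:hi])


def within_relative_error(estimate, exact, eps):
    """Check the relative-error condition |estimate - exact| <= eps * exact."""
    return abs(estimate - exact) <= eps * exact


# Elements observed out of timestamp order: (timestamp, value).
window = ExactAsyncWindow(max_width=10)
for ts, v in [(5, 2), (3, 7), (9, 1), (1, 4), (8, 3)]:
    window.insert(ts, v)

count = window.basic_count(now=10, width=5)  # timestamps 5, 8, 9 -> 3
total = window.window_sum(now=10, width=5)   # values 2 + 3 + 1 -> 6
```

A baseline of this kind can serve as ground truth when checking that an approximate summary stays within relative error ε, for example with within_relative_error(estimate, count, eps).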

Editor information

Wolfgang Thomas, Pascal Weil

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Busch, C., Tirthapura, S. (2007). A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window. In: Thomas, W., Weil, P. (eds) STACS 2007. Lecture Notes in Computer Science, vol 4393. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70918-3_40

  • DOI: https://doi.org/10.1007/978-3-540-70918-3_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70917-6

  • Online ISBN: 978-3-540-70918-3

  • eBook Packages: Computer Science, Computer Science (R0)
