Abstract
In an asynchronous data stream, the data items may arrive out of order with respect to their original timestamps. This paper gives a space-efficient data structure that maintains such a data stream so that the frequent item set over a sliding time window can be approximated with sufficient accuracy. Prior to our work, the best solution was due to Cormode et al. [3], with space complexity \(O(\frac{1}{\varepsilon} \log W \log (\frac{\varepsilon B}{\log W}) \min\{\log W, \frac{1}{\varepsilon}\}\log U)\), where \(\varepsilon\) is the given error bound, \(W\) and \(B\) are parameters of the sliding window, and \(U\) is the number of all possible item names. Our solution reduces the space to \(O(\frac{1}{\varepsilon} \log W \log (\frac{\varepsilon B}{\log W}))\). We also unify the study of synchronous and asynchronous data streams by quantifying the delay of the data items. When the delay is zero, our solution matches the space complexity of the best solution for synchronous data streams [8].
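To make the problem statement concrete, the following is a minimal sketch of an exact, space-inefficient baseline. It is not the data structure of this paper: it stores every item with its timestamp, accepts out-of-order arrivals, and reports the items whose count within the current window reaches a given fraction of the window's total. The class name `ExactWindowCounter`, the parameters `window` and `phi`, and the half-open window convention are all assumptions made for illustration.

```python
from collections import Counter


class ExactWindowCounter:
    """Exact baseline (hypothetical): keeps every record, so space is linear."""

    def __init__(self, window):
        self.window = window  # window length in time units (assumed convention)
        self.records = []     # (timestamp, item) pairs, in arrival order

    def insert(self, timestamp, item):
        # Arrival order need not match timestamp order (asynchronous stream).
        self.records.append((timestamp, item))

    def frequent(self, now, phi):
        # Count items whose timestamps fall in the half-open window
        # (now - window, now].
        counts = Counter(
            item for ts, item in self.records if now - self.window < ts <= now
        )
        total = sum(counts.values())
        # Report items that make up at least a phi fraction of the window.
        return {item for item, c in counts.items() if total and c >= phi * total}
```

A space-efficient summary would replace the full record list with approximate counters; a baseline of this kind is useful only as a reference for checking such a summary on small inputs.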
References
Arasu, A., Manku, G.: Approximate counts and quantiles over sliding windows. In: Proc. PODS, pp. 286–296 (2004)
Busch, C., Tirthapura, S.: A deterministic algorithm for summarizing asynchronous streams over a sliding window. In: Thomas, W., Weil, P. (eds.) STACS 2007. LNCS, vol. 4393, pp. 465–476. Springer, Heidelberg (2007)
Cormode, G., Korn, F., Tirthapura, S.: Time-decaying aggregates in out-of-order streams. In: Proc. PODS, pp. 89–98 (2008)
Datar, M., Gionis, A., Indyk, P., Motwani, R.: Maintaining stream statistics over sliding windows. SIAM Journal on Computing 31(6), 1794–1813 (2002)
Demaine, E., Lopez-Ortiz, A., Munro, J.: Frequency estimation of internet packet streams with limited space. In: Möhring, R.H., Raman, R. (eds.) ESA 2002. LNCS, vol. 2461, pp. 348–360. Springer, Heidelberg (2002)
Karp, R., Shenker, S., Papadimitriou, C.: A simple algorithm for finding frequent elements in streams and bags. ACM Trans. Database Systems 28(1), 51–55 (2003)
Lee, L.K., Ting, H.F.: Maintaining significant stream statistics over sliding windows. In: Proc. SODA, pp. 724–732 (2006)
Lee, L.K., Ting, H.F.: A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In: Proc. PODS, pp. 290–297 (2006)
Misra, J., Gries, D.: Finding repeated elements. Science of Computer Programming 2(2), 143–152 (1982)
Shrivastava, N., Buragohain, C., Agrawal, D., Suri, S.: Medians and beyond: new aggregation techniques for sensor networks. In: Proc. SenSys, pp. 239–249 (2004)
Tirthapura, S., Xu, B., Busch, C.: Sketching asynchronous streams over a sliding window. In: Proc. PODC, pp. 82–91 (2006)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Chan, HL., Lam, TW., Lee, LK., Ting, HF. (2010). Approximating Frequent Items in Asynchronous Data Stream over a Sliding Window. In: Bampis, E., Jansen, K. (eds) Approximation and Online Algorithms. WAOA 2009. Lecture Notes in Computer Science, vol 5893. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12450-1_5
DOI: https://doi.org/10.1007/978-3-642-12450-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12449-5
Online ISBN: 978-3-642-12450-1
eBook Packages: Computer Science, Computer Science (R0)