ABSTRACT
We present a new sketch for summarizing network data. The sketch has properties that make it useful for communication-efficient aggregation in distributed streaming scenarios such as sensor networks. It is duplicate-insensitive: re-inserting the same data does not change the sketch, and hence does not change the estimates of aggregates. Unlike previous duplicate-insensitive sketches for sensor data aggregation [26,12], it is also time-decaying, so that the weight of a data item in the sketch can decrease with time according to a user-specified decay function. The sketch gives provable approximation guarantees for various aggregates of the data, including the sum, median, quantiles, and frequent elements. The size of the sketch and the time taken to update it are both polylogarithmic in the size of the relevant data. Further, multiple sketches computed over distributed data can be combined without losing the accuracy guarantees. To our knowledge, this is the first sketch that combines all of these properties.
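To make the target quantity concrete, the following is a minimal Python illustration of the semantics described in the abstract: a duplicate-insensitive, time-decayed sum that can be combined across nodes. It is not the sketch from the paper. It stores every observation exactly, so it uses linear rather than polylogarithmic space, and only shows the value that such a sketch would approximate. The observation format `(item_id, timestamp, value)`, the function names, and the two example decay functions are assumptions made for this illustration.

```python
from typing import Callable, Iterable, Set, Tuple

# Hypothetical observation format: (item_id, timestamp, value).
# Identical triples are treated as the same observation.
Observation = Tuple[str, int, float]
Decay = Callable[[int], float]  # maps an item's age to its weight


def sliding_window(width: int) -> Decay:
    """Example decay: weight 1 for ages in [0, width], weight 0 otherwise."""
    return lambda age: 1.0 if 0 <= age <= width else 0.0


def polynomial_decay(alpha: float) -> Decay:
    """Example decay: weight (1 + age)^(-alpha), intended for age >= 0."""
    return lambda age: 1.0 / (1 + age) ** alpha


def merge(a: Set[Observation], b: Set[Observation]) -> Set[Observation]:
    """Combine two summaries; set union makes re-insertions harmless."""
    return a | b


def decayed_sum(observations: Iterable[Observation], now: int, decay: Decay) -> float:
    """Exact decayed sum over distinct observations (reference semantics)."""
    distinct = set(observations)
    return sum(value * decay(now - ts) for (_item, ts, value) in distinct)


# Two nodes that both observed item "s2".
node_a = {("s1", 10, 4.0), ("s2", 12, 6.0)}
node_b = {("s2", 12, 6.0), ("s3", 19, 2.0)}
combined = merge(node_a, node_b)

recent_total = decayed_sum(combined, now=20, decay=sliding_window(5))
wide_total = decayed_sum(combined, now=20, decay=sliding_window(10))
```

Here `combined` holds three observations even though "s2" was reported by both nodes, and the decay function is chosen at query time rather than at insertion time. A real sketch would replace the exact set with a compact randomized summary while preserving these two behaviours approximately.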
REFERENCES
[1] C. C. Aggarwal. On biased reservoir sampling in the presence of stream evolution. In VLDB, 2006.
[2] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. A survey on sensor networks. IEEE Communications Magazine, 40(8):102--114, 2002.
[3] N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 58(1):137--147, 1999.
[4] A. Arasu and G. Manku. Approximate counts and quantiles over sliding windows. In PODS, 2004.
[5] B. Babcock, M. Datar, and R. Motwani. Sampling from a moving window over streaming data. In SODA, 2002.
[6] B. Babcock, M. Datar, R. Motwani, and L. O'Callaghan. Maintaining variance and k-medians over data stream windows. In PODS, 2003.
[7] A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher. Min-wise independent permutations. In STOC, 1998.
[8] C. Busch and S. Tirthapura. A deterministic algorithm for summarizing asynchronous streams over a sliding window. In STACS, 2007.
[9] J. L. Carter and M. L. Wegman. Universal classes of hash functions. Journal of Computer and System Sciences, 18(2):143--154, 1979.
[10] Y. Chen, H. V. Leong, M. Xu, J. Cao, K. C. C. Chan, and A. T. S. Chan. In-network data processing for wireless sensor networks. In MDM (International Conference on Mobile Data Management), 2006.
[11] E. Cohen and M. Strauss. Maintaining time-decaying stream aggregates. In PODS, 2003.
[12] J. Considine, F. Li, G. Kollios, and J. Byers. Approximate aggregation techniques for sensor databases. In ICDE, 2004.
[13] G. Cormode and S. Muthukrishnan. Space efficient mining of multigraph streams. In PODS, 2005.
[14] G. Cormode, S. Tirthapura, and B. Xu. Time-decaying sketches for sensor data aggregation. Technical Report TR-2007-06-0, Department of Electrical and Computer Engineering, Iowa State University.
[15] M. Datar, A. Gionis, P. Indyk, and R. Motwani. Maintaining stream statistics over sliding windows. SIAM Journal on Computing, 31(6):1794--1813, 2002.
[16] P. Flajolet and G. N. Martin. Probabilistic counting. In FOCS, 1983.
[17] P. Gibbons and S. Tirthapura. Estimating simple functions on the union of data streams. In SPAA, 2001.
[18] P. Gibbons and S. Tirthapura. Distributed streams algorithms for sliding windows. In SPAA, 2002.
[19] P. Indyk and D. Woodruff. Tight lower bounds for the distinct elements problem. In FOCS, 2003.
[20] N. Kimura and S. Latifi. A survey on data compression in wireless sensor networks. In ITCC (International Conference on Information Technology: Coding and Computing), 2005.
[21] E. Kushilevitz and N. Nisan. Communication Complexity. Cambridge University Press, 1997.
[22] L. K. Lee and H. F. Ting. A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In PODS, 2006.
[23] S. Madden, M. Franklin, J. Hellerstein, and W. Hong. TAG: a tiny aggregation service for ad-hoc sensor networks. SIGOPS Operating Systems Review, 36(SI):131--146, 2002.
[24] J. I. Munro and M. S. Paterson. Selection and sorting with limited storage. Theoretical Computer Science, 12(3):315--323, 1980.
[25] S. Muthukrishnan. Data Streams: Algorithms and Applications. Foundations and Trends in Theoretical Computer Science. Now Publishers, August 2005.
[26] S. Nath, P. B. Gibbons, S. Seshan, and Z. Anderson. Synopsis diffusion for robust aggregation in sensor networks. In SENSYS, 2004.
[27] A. Pavan and S. Tirthapura. Range-efficient computation of F0 over massive data streams. SIAM Journal on Computing, 37(2):359--379, 2007.
[28] S. Tirthapura, B. Xu, and C. Busch. Sketching asynchronous streams over a sliding window. In PODC, 2006.
Index Terms
- Time-decaying sketches for sensor data aggregation