DOI: 10.1145/1265530.1265562
Article

Variance estimation over sliding windows

Published: 11 June 2007

Abstract

Capturing the characteristics of large data streams has received considerable attention. Constraints on space and time restrict data stream processing to a single pass (or a small number of passes). Processing data streams over sliding windows makes the problem more difficult and challenging. In this paper, we address the problem of maintaining an ε-approximate variance of a data stream over sliding windows. To our knowledge, the best existing algorithm requires O((1/ε²) log N) space, though the lower bound for this problem is Ω((1/ε) log N). We propose the first ε-approximation algorithm for this problem that is optimal in both space and worst-case time. Our algorithm requires O((1/ε) log N) space, and its running time is O(1) in the worst case.
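To make the problem statement concrete, the sketch below maintains the exact variance of a sliding window by storing every element in it. This is not the algorithm proposed in the paper; it is the naive O(N)-space baseline that an approximate summary is meant to avoid. It rests on assumptions the abstract does not spell out: N is taken to be the window size, variance is taken to be the sum of squared deviations from the window mean, and an ε-approximation is taken to mean relative error at most ε. The names `ExactWindowVariance` and `is_eps_approximation` are hypothetical.

```python
from collections import deque


class ExactWindowVariance:
    """Exact baseline (not the paper's algorithm): keeps all window elements."""

    def __init__(self, window_size):
        if window_size < 1:
            raise ValueError("window_size must be positive")
        self.window_size = window_size  # assumed to be the N of the abstract
        self.items = deque()            # O(N) space: every element is stored
        self.mean = 0.0
        self.v = 0.0                    # sum of squared deviations from the mean

    def _add(self, x):
        # Welford-style update when an element enters the window.
        self.items.append(x)
        n = len(self.items)
        delta = x - self.mean
        self.mean += delta / n
        self.v += delta * (x - self.mean)

    def _evict(self):
        # Reverse of the update above when the oldest element expires.
        x = self.items.popleft()
        n = len(self.items)
        if n == 0:
            self.mean, self.v = 0.0, 0.0
            return
        old_mean = self.mean
        self.mean = old_mean - (x - old_mean) / n
        self.v -= (x - old_mean) * (x - self.mean)
        self.v = max(self.v, 0.0)       # guard against tiny negative round-off

    def update(self, x):
        """Slide the window by one element and return the current variance."""
        if len(self.items) == self.window_size:
            self._evict()
        self._add(x)
        return self.v


def is_eps_approximation(estimate, exact, eps):
    """Assumed criterion: relative error of the estimate is at most eps."""
    return abs(estimate - exact) <= eps * exact
```

Each update takes O(1) time, but memory grows linearly with the window. The space bounds quoted in the abstract concern summaries that replace the stored window with something of size logarithmic in N, at the cost of an ε error.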

Published In

PODS '07: Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2007
328 pages
ISBN: 9781595936851
DOI: 10.1145/1265530
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data mining
  2. data streams
  3. sliding windows
  4. variance estimation

Qualifiers

  • Article

Conference

SIGMOD/PODS07

Acceptance Rates

PODS '07 Paper Acceptance Rate 28 of 187 submissions, 15%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 11
  • Downloads (last 6 weeks): 1

Reflects downloads up to 05 Mar 2025

Cited By

  • (2024) Fault Diagnostics Based on the Analysis of Probability Distributions Estimated Using a Particle Filter. Sensors 24:3 (719). DOI: 10.3390/s24030719. Online publication date: 23-Jan-2024
  • (2023) Relaxation to Quantum Equilibrium and the Born Rule in Nelson’s Stochastic Dynamics. Foundations of Physics 53:6. DOI: 10.1007/s10701-023-00730-w. Online publication date: 6-Nov-2023
  • (2020) Stratified random sampling from streaming and stored data. Distributed and Parallel Databases 39:3 (665-710). DOI: 10.1007/s10619-020-07315-w. Online publication date: 23-Oct-2020
  • (2018) HEARTDROID—Rule engine for mobile and context‐aware expert systems. Expert Systems 36:1. DOI: 10.1111/exsy.12328. Online publication date: 23-Aug-2018
  • (2017) Rules in Mobile Context-Aware Systems. Modeling with Rules Using Semantic Knowledge Engineering (403-430). DOI: 10.1007/978-3-319-66655-6_17. Online publication date: 5-Oct-2017
  • (2016) Identifying correlated heavy-hitters in a two-dimensional data stream. Data Mining and Knowledge Discovery 30:4 (797-818). DOI: 10.1007/s10618-015-0438-6. Online publication date: 1-Jul-2016
  • (2015) Efficiently Summarizing Data Streams over Sliding Windows. Proceedings of the 2015 IEEE 14th International Symposium on Network Computing and Applications (NCA) (151-158). DOI: 10.1109/NCA.2015.46. Online publication date: 28-Sep-2015
  • (2014) Mining frequent items in data stream using time fading model. Information Sciences: an International Journal 257 (54-69). DOI: 10.1016/j.ins.2013.09.007. Online publication date: 1-Feb-2014
  • (2010) Sketch-Based Streaming PCA Algorithm for Network-Wide Traffic Anomaly Detection. Proceedings of the 2010 IEEE 30th International Conference on Distributed Computing Systems (807-816). DOI: 10.1109/ICDCS.2010.45. Online publication date: 21-Jun-2010
  • (2010) The frequent items problem, under polynomial decay, in the streaming model. Theoretical Computer Science 411:34-36 (3048-3054). DOI: 10.1016/j.tcs.2010.04.029. Online publication date: 1-Jul-2010
