The continuous distributed monitoring model

Published: 01 May 2013

Abstract

In the model of continuous distributed monitoring, a number of observers each see a stream of observations. Their goal is to work together to compute a function of the union of their observations. This function can be as simple as a count of the total number of observations, or as complex as a non-linear function such as the entropy of the induced distribution. Assuming that it is too costly to simply centralize all the observations, it becomes quite challenging to design solutions that provide a good approximation to the current answer while bounding the communication cost of the observers and their other resources, such as their space usage. This survey introduces the model and describes a selection of results in this setting, from the simple counting problem to a variety of other functions that have been studied.
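To make the counting problem in the abstract concrete, here is a minimal simulation sketch. It is not taken from the survey. It assumes one well-known threshold-based approach, in which each observer reports its local count to a coordinator only when that count has grown by more than a factor of (1 + eps) since its last report. The names `Site`, `Coordinator`, `simulate`, and `eps` are hypothetical and chosen only for illustration.

```python
class Site:
    """One observer: keeps an exact local count, reports only on sufficient growth."""

    def __init__(self, eps):
        self.eps = eps        # assumed relative error parameter, eps > 0
        self.count = 0        # exact number of local observations
        self.reported = 0     # last count sent to the coordinator

    def observe(self):
        """Record one observation; return the new count if a report is due, else None."""
        self.count += 1
        # Report when the local count exceeds (1 + eps) times the last reported value.
        if self.count > (1 + self.eps) * self.reported:
            self.reported = self.count
            return self.count
        return None


class Coordinator:
    """Tracks the last reported count of every site and sums them on demand."""

    def __init__(self, num_sites):
        self.last = [0] * num_sites
        self.messages = 0     # communication cost, measured in messages

    def receive(self, site_id, value):
        self.last[site_id] = value
        self.messages += 1

    def estimate(self):
        # Each site's true count lies between its last report and (1 + eps) times
        # that report, so the sum satisfies estimate <= true <= (1 + eps) * estimate.
        return sum(self.last)


def simulate(stream, num_sites, eps):
    """Feed a sequence of site ids (one per observation) through the protocol."""
    sites = [Site(eps) for _ in range(num_sites)]
    coord = Coordinator(num_sites)
    for site_id in stream:
        report = sites[site_id].observe()
        if report is not None:
            coord.receive(site_id, report)
    return coord


if __name__ == "__main__":
    n, k, eps = 10_000, 4, 0.1
    coord = simulate((i % k for i in range(n)), k, eps)
    print("true count:", n)
    print("estimate:  ", coord.estimate())
    print("messages:  ", coord.messages)
```

Under these assumptions the coordinator's estimate never exceeds the true count and is never smaller than it by more than a (1 + eps) factor, while the number of messages grows roughly logarithmically in each site's count rather than linearly.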

Published In

ACM SIGMOD Record, Volume 42, Issue 1
March 2013
51 pages
ISSN: 0163-5808
DOI: 10.1145/2481528

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2013
Published in SIGMOD Volume 42, Issue 1


Qualifiers

  • Column
