Abstract
Systems for processing continuous monitoring queries over data streams must be adaptive because data streams are often bursty and data characteristics may vary over time. In this chapter, we focus on one particular type of adaptivity: the ability to gracefully degrade performance via “load shedding” (dropping unprocessed tuples to reduce system load) when the demands placed on the system cannot be met in full given available resources. Focusing on aggregation queries, we present algorithms that determine at which points in a query plan load shedding should be performed, and how much load should be shed at each point, in order to minimize the inaccuracy introduced into query answers. We also discuss load shedding strategies for other types of queries (set-valued queries, join queries, and classification queries).
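As a rough illustration of the idea for a single aggregate, the sketch below drops each tuple at random and rescales a SUM so that the estimate stays unbiased. It is a minimal example under stated assumptions, not the chapter's algorithm: the function name `shed_and_sum`, the parameter `keep_prob`, and the choice of one shedding point applied to one SUM query are all hypothetical, and the sketch does not address where in a query plan to shed or how to split load across several points.

```python
import random


def shed_and_sum(values, keep_prob, rng):
    """Estimate sum(values) while processing only a random fraction of tuples.

    Hypothetical helper for illustration: keep_prob is the probability that a
    tuple survives the load shedder; rng is a random.Random instance.
    """
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    kept_total = 0.0
    kept_count = 0
    for v in values:
        # Load shedder: drop the tuple with probability 1 - keep_prob.
        if rng.random() < keep_prob:
            kept_total += v
            kept_count += 1
    # Scale by 1 / keep_prob so the expected value equals the true sum.
    return kept_total / keep_prob, kept_count


if __name__ == "__main__":
    stream = range(1000)
    estimate, processed = shed_and_sum(stream, 0.25, random.Random(7))
    print(f"processed {processed} of 1000 tuples, estimated sum {estimate:.1f}")
```

Lowering `keep_prob` reduces the number of tuples processed but increases the variance of the estimate, which is the trade-off between system load and answer accuracy that the abstract describes.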
Copyright information
© 2007 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Babcock, B., Datar, M., Motwani, R. (2007). Load Shedding in Data Stream Systems. In: Aggarwal, C.C. (eds) Data Streams. Advances in Database Systems, vol 31. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-47534-9_7
DOI: https://doi.org/10.1007/978-0-387-47534-9_7
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-28759-1
Online ISBN: 978-0-387-47534-9
eBook Packages: Computer Science, Computer Science (R0)