DOI: 10.1145/1007568.1007615
Article

Adaptive ordering of pipelined stream filters

Published: 13 June 2004

Abstract

We consider the problem of pipelined filters, where a continuous stream of tuples is processed by a set of commutative filters. Pipelined filters are common in stream applications and capture a large class of multiway stream joins. We focus on the problem of ordering the filters adaptively to minimize processing cost in an environment where stream and filter characteristics vary unpredictably over time. Our core algorithm, A-Greedy (for Adaptive Greedy), has strong theoretical guarantees: If stream and filter characteristics were to stabilize, A-Greedy would converge to an ordering within a small constant factor of optimal. (In experiments A-Greedy usually converges to the optimal ordering.) One very important feature of A-Greedy is that it monitors and responds to selectivities that are correlated across filters (i.e., that are nonindependent), which provides the strong quality guarantee but incurs run-time overhead. We identify a three-way tradeoff among provable convergence to good orderings, run-time overhead, and speed of adaptivity. We develop a suite of variants of A-Greedy that lie at different points on this tradeoff spectrum. We have implemented all our algorithms in the STREAM prototype Data Stream Management System and a thorough performance evaluation is presented.
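
The abstract does not spell out how A-Greedy works, so the following is only a minimal sketch of the underlying idea, under stated assumptions: a static greedy rule that repeatedly picks the filter with the highest drop rate per unit cost, measured on the sampled tuples that survived the filters already chosen. Conditioning on survivors is what lets the ordering respond to correlated selectivities. The names `greedy_order`, `cost_per_tuple`, the integer "tuples", and the three filters are hypothetical illustrations, and the sketch omits the run-time monitoring and re-ordering that make the real algorithm adaptive.

```python
from typing import Callable, List, Sequence

# Hypothetical representation: a filter returns True when a tuple passes.
Filter = Callable[[int], bool]


def greedy_order(filters: Sequence[Filter], costs: Sequence[float],
                 sample: Sequence[int]) -> List[int]:
    """Return filter indices ordered by conditional drop rate per unit cost."""
    remaining = set(range(len(filters)))
    survivors = list(sample)  # sampled tuples that passed all chosen filters
    order: List[int] = []
    while remaining:
        def score(i: int) -> float:
            if not survivors:
                return 0.0
            dropped = sum(1 for t in survivors if not filters[i](t))
            return (dropped / len(survivors)) / costs[i]

        # sorted() makes ties deterministic: the lowest index wins.
        best = max(sorted(remaining), key=score)
        order.append(best)
        remaining.remove(best)
        survivors = [t for t in survivors if filters[best](t)]
    return order


def cost_per_tuple(order: Sequence[int], filters: Sequence[Filter],
                   costs: Sequence[float], sample: Sequence[int]) -> float:
    """Average processing cost: a tuple stops at the first filter that drops it."""
    total = 0.0
    for t in sample:
        for i in order:
            total += costs[i]
            if not filters[i](t):
                break
    return total / len(sample)


# Illustrative data (assumed): filters 0 and 1 are strongly correlated,
# since every tuple dropped by filter 1 is also dropped by filter 0.
sample = list(range(10))
filters: List[Filter] = [
    lambda x: x >= 6,                  # drops 60% of the sample
    lambda x: x >= 5,                  # drops 50%, but none that pass filter 0
    lambda x: x not in (1, 3, 7, 9),   # drops 40%, half of filter 0's survivors
]
costs = [1.0, 1.0, 1.0]

correlation_aware = greedy_order(filters, costs, sample)
independence_based = [0, 1, 2]  # ranking by unconditional drop rate alone
```

On this sample, ranking by unconditional drop rates places filter 1 second even though it drops nothing after filter 0, whereas the conditional rule promotes filter 2 and lowers the average cost per tuple. Comparing `cost_per_tuple` for the two orderings shows the difference.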

Information & Contributors

Information

Published In

SIGMOD '04: Proceedings of the 2004 ACM SIGMOD international conference on Management of data
June 2004
988 pages
ISBN:1581138598
DOI:10.1145/1007568
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

SIGMOD/PODS04
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Cited By

  • (2024) POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance. Proceedings of the VLDB Endowment 17:6 (1350-1363). DOI: 10.14778/3648160.3648175. Online publication date: 1-Feb-2024
  • (2023) Simple Adaptive Query Processing vs. Learned Query Optimizers: Observations and Analysis. Proceedings of the VLDB Endowment 16:11 (2962-2975). DOI: 10.14778/3611479.3611501. Online publication date: 24-Aug-2023
  • (2023) A survey on the evolution of stream processing systems. The VLDB Journal 33:2 (507-541). DOI: 10.1007/s00778-023-00819-8. Online publication date: 22-Nov-2023
  • (2022) Demonstration of accelerating machine learning inference queries with correlative proxy models. Proceedings of the VLDB Endowment 15:12 (3734-3737). DOI: 10.14778/3554821.3554887. Online publication date: 1-Aug-2022
  • (2022) Optimizing machine learning inference queries with correlative proxy models. Proceedings of the VLDB Endowment 15:10 (2032-2044). DOI: 10.14778/3547305.3547310. Online publication date: 1-Jun-2022
  • (2022) Adaptive SQL Query Optimization in Distributed Stream Processing: A Preliminary Study. Software Foundations for Data Interoperability (96-109). DOI: 10.1007/978-3-030-93849-9_7. Online publication date: 19-Jan-2022
  • (2021) Improved approximations for min sum vertex cover and generalized min sum set cover. Proceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (986-1005). DOI: 10.5555/3458064.3458126. Online publication date: 10-Jan-2021
  • (2021) Synchronization Schemas. Proceedings of the 40th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (1-18). DOI: 10.1145/3452021.3458317. Online publication date: 20-Jun-2021
  • (2021) Graphsurge. Proceedings of the 2021 International Conference on Management of Data (1518-1530). DOI: 10.1145/3448016.3452837. Online publication date: 9-Jun-2021
  • (2021) Small Selectivities Matter: Lifting the Burden of Empty Samples. Proceedings of the 2021 International Conference on Management of Data (697-709). DOI: 10.1145/3448016.3452805. Online publication date: 9-Jun-2021
