research-article

A generic flow algorithm for shared filter ordering problems

Authors: Zhen Liu, Srinivasan Parthasarathy, Anand Ranganathan, Hao Yang

IBM T.J. Watson Research Center, Hawthorne, NY, USA

PODS '08: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, June 2008, pages 79–88. https://doi.org/10.1145/1376916.1376929

Published: 09 June 2008

ABSTRACT

We consider a fundamental flow maximization problem that arises during the evaluation of multiple overlapping queries defined on a data stream in a heterogeneous parallel environment. Each query is a conjunction of boolean filters, and each filter may be shared across multiple queries. We are required to design an evaluation plan that evaluates filters against stream items in order to determine the set of queries satisfied by each item. The evaluation plan specifies for each item: (i) the subset of filters evaluated for this item and the order of their evaluation, and (ii) the processor on which each filter evaluation occurs. Our goal is to design an evaluation plan that maximizes the total throughput (flow) of the stream handled by the plan without violating the processor capacities.
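To make the setting concrete, the following is a minimal sketch of evaluating one stream item against several conjunctive queries that share filters. It is an illustration only, not the paper's method: the filter names, the queries, the fixed evaluation order, and the function `evaluate_item` are all hypothetical, and processor assignment is left out entirely.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Hypothetical boolean filters over integer stream items.
FILTERS: Dict[str, Callable[[int], bool]] = {
    "even": lambda x: x % 2 == 0,
    "positive": lambda x: x > 0,
    "small": lambda x: x < 100,
}

# Each query is a conjunction of filters; every filter here is shared by two queries.
QUERIES: Dict[str, FrozenSet[str]] = {
    "q1": frozenset({"even", "positive"}),
    "q2": frozenset({"even", "small"}),
    "q3": frozenset({"positive", "small"}),
}

ORDER: List[str] = ["even", "positive", "small"]  # assumed fixed order


def evaluate_item(
    item: int,
    filters: Dict[str, Callable[[int], bool]],
    queries: Dict[str, FrozenSet[str]],
    order: List[str],
) -> Tuple[Set[str], List[str]]:
    """Return (queries satisfied by item, filters actually evaluated, in order)."""
    results: Dict[str, bool] = {}
    failed: Set[str] = set()      # queries already known to be unsatisfied
    evaluated: List[str] = []
    for name in order:
        # Skip a filter when every query that uses it has already failed.
        if not any(name in fs and q not in failed for q, fs in queries.items()):
            continue
        results[name] = filters[name](item)  # a shared filter runs once per item
        evaluated.append(name)
        if not results[name]:
            failed.update(q for q, fs in queries.items() if name in fs)
    satisfied = {
        q for q, fs in queries.items()
        if q not in failed and all(results.get(f, False) for f in fs)
    }
    return satisfied, evaluated
```

For the item `-3`, the first two filters fail every query, so the third filter is never evaluated. The order in which filters run therefore determines how much work each item costs, which is what an evaluation plan has to optimize.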

Filter ordering has received extensive attention in single-processor settings, with the objective of minimizing the total cost of filter evaluations: in particular, efficient (approximation) algorithms are known for various important versions of min-cost filter ordering. The min-cost filter ordering problem for a single processor is a special case of our flow-maximization problem for parallel processors. Our main contribution in this work is a generic flow-maximization algorithm, which assumes the availability of a min-cost filter ordering algorithm for a single processor and uses it to iteratively construct a solution to the flow-maximization problem for heterogeneous parallel processors. We show that the approximation ratio of our flow-maximization strategy is essentially the same as that of the underlying min-cost filter ordering algorithm. Our result, along with existing results on min-cost filter ordering, enables the optimization of several important versions of filter ordering in parallel environments.
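The abstract does not spell out the iteration, so the sketch below is only one plausible shape for it, under loud assumptions. It uses a multiplicative-weights packing scheme in the style of Garg and Könemann as a stand-in: each processor carries a price, a min-cost oracle is called under the current prices, flow is pushed along the returned plan, and the prices of the processors it loads go up. The oracle here is a toy for a single query with independent filters (cheapest priced processor per filter, then ordering by priced cost divided by drop probability); it ignores filter sharing. All names (`priced_oracle`, `plan_load`, `max_flow`) and the final rescaling step are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Plan = Tuple[Tuple[str, int], ...]  # ordered (filter, processor index) pairs


def priced_oracle(prices: List[float], time: Dict[str, List[float]],
                  pass_prob: Dict[str, float]) -> Plan:
    """Toy min-cost oracle (assumption: one query, independent filters)."""
    # Put each filter on the processor where its priced cost is lowest.
    choice = {f: min(range(len(row)), key=lambda i: prices[i] * row[i])
              for f, row in time.items()}

    def rank(f: str) -> float:
        drop = 1.0 - pass_prob[f]
        cost = prices[choice[f]] * time[f][choice[f]]
        return cost / drop if drop > 0 else math.inf

    return tuple((f, choice[f]) for f in sorted(time, key=rank))


def plan_load(plan: Plan, time: Dict[str, List[float]],
              pass_prob: Dict[str, float], n_proc: int) -> List[float]:
    """Expected work per processor for one stream item routed through plan."""
    load = [0.0] * n_proc
    reach = 1.0  # probability that the item reaches the current filter
    for f, p in plan:
        load[p] += reach * time[f][p]
        reach *= pass_prob[f]
    return load


def max_flow(capacity: List[float], time: Dict[str, List[float]],
             pass_prob: Dict[str, float], eps: float = 0.1,
             max_iters: int = 10000) -> Dict[Plan, float]:
    """Return a flow per plan that respects every processor capacity."""
    n = len(capacity)
    delta = (1 + eps) / ((1 + eps) * n) ** (1 / eps)
    prices = [delta / c for c in capacity]
    flow: Dict[Plan, float] = {}
    for _ in range(max_iters):
        if sum(c * y for c, y in zip(capacity, prices)) >= 1.0:
            break
        plan = priced_oracle(prices, time, pass_prob)
        load = plan_load(plan, time, pass_prob, n)
        # Push flow until the plan's bottleneck processor is saturated once.
        step = min(c / l for c, l in zip(capacity, load) if l > 0)
        flow[plan] = flow.get(plan, 0.0) + step
        prices = [y * (1 + eps * step * l / c)
                  for y, l, c in zip(prices, load, capacity)]
    # Simplification: rescale by the observed peak load, not the analytic factor.
    used = [0.0] * n
    for plan, x in flow.items():
        for i, l in enumerate(plan_load(plan, time, pass_prob, n)):
            used[i] += x * l
    scale = max(u / c for u, c in zip(used, capacity))
    return {plan: x / scale for plan, x in flow.items()}
```

In this toy, swapping `priced_oracle` for a different single-processor ordering routine changes nothing else in the loop, which mirrors the "generic" structure the abstract describes.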

Index Terms

1. Theory of computation
  1. Design and analysis of algorithms

Published in
PODS '08: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
June 2008
330 pages
ISBN: 9781605581521
DOI: 10.1145/1376916
General Chair: Phokion Kolaitis, IBM Almaden Research Center, USA
Program Chair: Maurizio Lenzerini, SAPIENZA University of Rome, Italy
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
flow maximization
parallel
query optimization
shared filter ordering
Acceptance Rates
PODS '08 paper acceptance rate: 28 of 159 submissions, 18%. Overall acceptance rate: 642 of 2,707 submissions, 24%.