skip to main content
research-article

Parallel pipelined filter ordering with precedence constraints

Published: 04 October 2012 Publication History

Abstract

In the parallel pipelined filter ordering problem, we are given a set of n filters that run in parallel. The filters need to be applied to a stream of elements, to determine which elements pass all filters. Each filter has a rate limit ri on the number of elements it can process per unit time, and a selectivity pi, which is the probability that a random element will pass the filter. The goal is to maximize throughput. This problem appears naturally in a variety of settings, including parallel query optimization in databases and query processing over Web services.
We present an O(n3) algorithm for this problem, given tree-structured precedence constraints on the filters. This extends work of Condon et al. [2009] and Kodialam [2001], who presented algorithms for solving the problem without precedence constraints. Our algorithm is combinatorial and produces a sparse solution. Motivated by join operators in database queries, we also give algorithms for versions of the problem in which “filter” selectivities may be greater than or equal to 1.
We prove a strong connection between the more classical problem of minimizing total work in sequential filter ordering (A), and the parallel pipelined filter ordering problem (B). More precisely, we prove that A is solvable in polynomial time for a given class of precedence constraints if and only if B is as well. This equivalence allows us to show that B is NP-Hard in the presence of arbitrary precedence constraints (since A is known to be NP-Hard in that setting).

References

[1]
Avnur, R. and Hellerstein, J. M. 2000. Eddies: Continuously adaptive query processing. In SIGMOD '00: Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 261--272.
[2]
Babu, S., Motwani, R., Munagala, K., Nishizawa, I., and Widom, J. 2004. Adaptive ordering of pipelined stream filters. In SIGMOD '04: Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM Press, 407--418.
[3]
Bland, R. G., Goldfarb, D., and Todd, M. J. 1981. The ellipsoid method: A survey. Oper. Res. 29(6), 1039--1091.
[4]
Burge, J., Mingala, K., and Srivastava, U. 2005. Ordering pipelined query operators with precedence constraints. Tech. Rep. 2005-40, Stanford University.
[5]
Chaudhuri, S., Dayal, U., and Yan, T. W. 1995. Join queries with external text sources: Execution and optimization techniques. In SIGMOD '95: Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM Press, 410--422.
[6]
Condon, A., Deshpande, A., Hellerstein, L., and Wu, N. 2006. Flow algorithms for two pipelined filter ordering problems. In PODS '06: Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, June 26-28, 2006, Chicago, Illinois, USA. ACM, 193--202.
[7]
Condon, A., Deshpande, A., Hellerstein, L., and Wu, N. 2009. Algorithms for distributional and adversarial pipelined filter ordering problems. ACM Transactions on Algorithms 5, 2.
[8]
Dantzig, G. and Thapa, M. 2003. Linear Programming 2: Theory and Extensions. Springer.
[9]
Deshpande, A., Guestrin, C., Hong, W., and Madden, S. 2005. Exploiting correlated attributes in acquisitional query processing. In ICDE '05: Proceedings of the 21st International Conference on Data Engine. IEEE Computer Society, 143--154.
[10]
Deshpande, A. and Hellerstein, L. 2008. Flow algorithms for parallel query optimization. In ICDE '08: Proceedings of the 24th International Conference on Data Engineering. IEEE, 754--763.
[11]
Etzioni, O., Hanks, S., Jiang, T., Karp, R. M., Madani, O., and Waarts, O. 1996. Efficient information gathering on the internet. In FOCS '96: Proceedings of the 37th IEEE Symposium on Foundations of Computer Science. IEEE Computer Society, 234--243.
[12]
Feige, U., Lovász, L., and Tetali, P. 2004. Approximating min-sum set cover. Algorithmica 40, 4, 219--234.
[13]
Goldman, R. and Widom, J. 2000. WSQ/DSQ: A practical approach for combined querying of databases and the web. In SIGMOD '00: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data. ACM, 285--296.
[14]
Ibaraki, T. and Kameda, T. 1984. On the optimal nesting order for computing n-relational joins. ACM Trans. Database Syst. 9, 3, 482--502.
[15]
Kaplan, H., Kushilevitz, E., and Mansour, Y. 2005. Learning with attribute costs. In STOC' 05: The Thirty-Seventh Annual ACM Symposium on Theory of Computing. ACM Press, 356--365.
[16]
Khachiyan, L. G. 1979. A polynomial algorithm in linear programming. Soviet Math. Dokl. 20(1), 191--194.
[17]
Kodialam, M. 2001. The throughput of sequential testing. In Proceedings of the Conference on Integer Programming and Combinatorial Optimization (IPCO). Lecture Notes in Computer Science, vol. 2081. Springer-Verlag, 280--292.
[18]
Krishnamurthy, R., Boral, H., and Zaniolo, C. 1986. Optimization of nonrecursive queries. In VLDB '86: Proceedings of the 12th International Conference on Very Large Data Bases. Morgan Kaufmann, 128--137.
[19]
Lawler, E. L. 1978. Sequencing jobs to minimize total weighted completion time subject to precedence constraints. Ann. Disc. Math. 2, 75--90.
[20]
Liu, Z., Parthasarathy, S., Ranganathan, A., and Yang, H. 2008. A generic flow algorithm for shared filter ordering problems. In PODS '08: Proceedings of the 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. ACM, 79--88.
[21]
Megiddo, N. and Chandrasekaran, R. 1989. On the epsilon-perturbation method for avoiding degeneracy. Oper. Res. Lett. 8, 305--308.
[22]
Schrijver, A. 1986. Theory of Linear and Integer Programming. Wiley-Interscience.
[23]
Shayman, M. and Fernandez-Gaucherand, E. 2001. Risk-sensitive decision-theoretic diagnosis. IEEE Trans. Automat. Cont. 46, 1166--1171.
[24]
Simon, H. and Kadane, J. 1975. Optimal problem-solving search: All-or-none solutions. Artif. Intell. 6, 235--247.
[25]
Srivastava, U., Munagala, K., and Widom, J. 2005. Operator placement for in-network stream query processing. In PODS '05: Proceedings of the 24th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. ACM, 250--258.
[26]
Srivastava, U., Munagala, K., Widom, J., and Motwani, R. 2006. Query optimization over web services. In VLDB '06: Proceedings of the 32nd International Conference on Very Large Data Bases. ACM, 355--366.

Cited By

View all
  • (2021)Evaluation of Heuristics for Product Data ModelsBusiness Process Management Workshops10.1007/978-3-030-66498-5_26(355-366)Online publication date: 19-Jan-2021
  • (2020)Ranking queries optimization over external data sources2020 2nd International Conference on Mathematics and Information Technology (ICMIT)10.1109/ICMIT47780.2020.9046996(180-185)Online publication date: Feb-2020
  • (2020) Top- query optimization over data services Future Generation Computer Systems10.1016/j.future.2020.06.052Online publication date: Jul-2020
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Algorithms
ACM Transactions on Algorithms  Volume 8, Issue 4
September 2012
276 pages
ISSN:1549-6325
EISSN:1549-6333
DOI:10.1145/2344422
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 October 2012
Accepted: 01 November 2010
Revised: 01 October 2010
Received: 01 February 2010
Published in TALG Volume 8, Issue 4

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Pipelined filter ordering
  2. flow algorithms
  3. parallel databases
  4. query optimization
  5. sequential testing

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)0
Reflects downloads up to 27 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2021)Evaluation of Heuristics for Product Data ModelsBusiness Process Management Workshops10.1007/978-3-030-66498-5_26(355-366)Online publication date: 19-Jan-2021
  • (2020)Ranking queries optimization over external data sources2020 2nd International Conference on Mathematics and Information Technology (ICMIT)10.1109/ICMIT47780.2020.9046996(180-185)Online publication date: Feb-2020
  • (2020) Top- query optimization over data services Future Generation Computer Systems10.1016/j.future.2020.06.052Online publication date: Jul-2020
  • (2020)On Knowledge Transfer from Cost-Based Optimization of Data-Centric Workflows to Business Process RedesignTransactions on Large-Scale Data- and Knowledge-Centered Systems XLIII10.1007/978-3-662-62199-8_3(62-85)Online publication date: 13-Aug-2020
  • (2019)Optimization of data flow execution in a parallel environmentDistributed and Parallel Databases10.1007/s10619-018-7243-337:3(385-410)Online publication date: 1-Sep-2019
  • (2018)The many faces of data-centric workflow optimization: a surveyInternational Journal of Data Science and Analytics10.1007/s41060-018-0107-06:2(81-107)Online publication date: 6-Mar-2018
  • (2017)Max-Throughput for (Conservative) k-of-n TestingAlgorithmica10.1007/s00453-015-0089-477:2(595-618)Online publication date: 1-Feb-2017
  • (2017)Cost Optimization of Data Flows Based on Task Re-orderingTransactions on Large-Scale Data- and Knowledge-Centered Systems XXXIII10.1007/978-3-662-55696-2_4(113-145)Online publication date: 8-Aug-2017
  • (2017)Modeling Data Flow Execution in a Parallel EnvironmentBig Data Analytics and Knowledge Discovery10.1007/978-3-319-64283-3_14(183-196)Online publication date: 3-Aug-2017

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media