skip to main content
10.1145/1376916.1376929acmconferencesArticle/Chapter ViewAbstractPublication PagespodsConference Proceedingsconference-collections
research-article

A generic flow algorithm for shared filter ordering problems

Published:09 June 2008Publication History

ABSTRACT

We consider a fundamental flow maximization problem that arises during the evaluation of multiple overlapping queries defined on a data stream, in a heterogenous parallel environment. Each query is a conjunction of boolean filters, and each filter could be shared across multiple queries. We are required to design an evaluation plan that evaluates filters against stream items in order to determine the set of queries satisfied by each item. The evaluation plan specifies for each item: (i) the subset of filters evaluated for this item and the order of their evaluations, and (ii) the processor on which each filter evaluation occurs. Our goal is to design an evaluation plan which maximizes the total throughput (flow) of the stream handled by the plan, without violating the processor capacities.

Filter ordering has received extensive attention in single-processor settings, with the objective of minimizing the total cost of filter evaluations: in particular, efficient (approximation) algorithms are known for various important versions of min-cost filter ordering. Min-cost filter ordering problem for a single processor is a special case of our flow maximization for parallel processors. Our main contribution in this work is a generic flow-maximization algorithm, which assumes the availability of a min-cost filter ordering algorithm for a single processor, and uses this to iteratively construct a solution to the flow-maximization problem for heterogenous parallel processors. We show that the approximation ratio of our flow-maximization strategy is essentially the same as that of the underlying min-cost filter ordering algorithm. Our result, along with existing results on min-cost filter ordering, enables the optimization of several important versions of filter ordering in parallel environments.

References

  1. Ron Avnur and Joseph M. Hellerstein. Eddies: continuously adaptive query processing. SIGMOD Rec., 29(2):261--272, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Shivnath Babu, Rajeev Motwani, Kamesh Munagala, Itaru Nishizawa, and Jennifer Widom. Adaptive ordering of pipelined stream filters. In SIGMOD, pages 407--418, New York, NY, USA, 2004. ACM Press.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Amotz Bar-Noy, Mihir Bellare, Magn´us M. Halld´orsson, Hadas Shachnai, and Tami Tamir. On chromatic sums and distributed resource allocation. Inf. Comput, 140(2):183--202, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Surajit Chaudhuri, Umeshwar Dayal, and Tak W. Yan. Join queries with external text sources: execution and optimization techniques. In SIGMOD '95: Proceedings of the 1995 ACM SIGMOD international conference on Management of data, pages 410--422, New York, NY, USA, 1995. ACM.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. H. Chernoff. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics, 23:493--509, 1952.]]Google ScholarGoogle ScholarCross RefCross Ref
  6. Edith Cohen, Amos Fiat, and Haim Kaplan. Efficient sequences of trials. In SODA, pages 737--746, 2003.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Anne Condon, Amol Deshpande, Lisa Hellerstein, and Ning Wu. Flow algorithms for two pipelined filter ordering problems. In PODS, pages 193--202, New York, NY, USA, 2006. ACM Press.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Amol Deshpande, Carlos Guestrin, Wei Hong, and Samuel Madden. Exploiting correlated attributes in acquisitional query processing. In ICDE '05: Proceedings of the 21st International Conference on Data Engineering, pages 143--154, Washington, DC, USA, 2005. IEEE Computer Society.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Oren Etzioni, Steve Hanks, Tao Jiang, Richard M. Karp, Omid Madani, and Orli Waarts. Efficient information gathering on the internet (extended abstract). In FOCS, pages 234--243, 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Uriel Feige and Prasad Tetali. Approximating min sum set cover. Algorithmica, 40(4):219--234, 2004.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. N. Garg and J. Koenemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. In Proceedings of the 39th Annual Symposium on Foundations of Computer Science, page 300. IEEE Computer Science Society, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Roy Goldman and Jennifer Widom. Wsq/dsq: a practical approach for combined querying of databases and the web. In SIGMOD '00: Proceedings of the 2000 ACM SIGMOD international conference on Management of data, pages 285--296, New York, NY, USA, 2000. ACM.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. J. Hellerstein and M. Stonebraker. Predicate migration: Optimizing queries with expensive predicates. In Proc. SIGMOD, 1993.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. W. Hoeffding. Probability inequalities for sums of bounded random variables. American Statistical Association Journal, 58:13--30, 1963.]]Google ScholarGoogle ScholarCross RefCross Ref
  15. T. Ibaraki and T. Kameda. On the optimal nesting order for computing n-relational joins. ACM Trans. on Database Systems, 9(3):482--502, 1984.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Haim Kaplan, Eyal Kushilevitz, and Yishay Mansour. Learning with attribute costs. In STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing, pages 356--365, New York, NY, USA, 2005. ACM.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Murali S. Kodialam. The throughput of sequential testing. In Proceedings of the 8th International IPCO Conference on Integer Programming and Combinatorial Optimization, pages 280--292, London, UK, 2001. Springer-Verlag.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Ravi Krishnamurthy, Haran Boral, and Carlo Zaniolo. Optimization of nonrecursive queries. In Proc. VLDB, pages 128--137, 1986.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Zhen Liu, Srinivasan Parthasarathy, Anand Ranganathan, and Hao Yang. Near-optimal algorithms for shared filter evaluation in data stream systems. In Proc. of ACM SIGMOD (to appear), 2008.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Kamesh Munagala, Shivnath Babu, Rajeev Motwani, and Jennifer Widom. The pipelined set cover problem. In 10th International Conference on Database Theory - ICDT, pages 83--98, 2005.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Kamesh Munagala, Utkarsh Srivastava, and Jennifer Widom. Optimization of continuous queries with shared expensive filters. In PODS '07: Proceedings of the twenty-sixth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pages 215--224, New York, NY, USA, 2007. ACM.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Serge A. Plotkin, David B. Shmoys, and Éva Tardos. Fast approximation algorithms for fractional packing and covering problems. In Proceedings of the 32nd annual symposium on Foundations of computer science, pages 495--504, Los Alamitos, CA, USA, 1991. IEEE Computer Society Press.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. H. Simon and J. Kadane. Optimal problem-solving search: All-or-none solutions. Artificial Intelligence, 6:235--247, 1975.]]Google ScholarGoogle ScholarCross RefCross Ref
  24. Vijay V. Vazirani. Approximation algorithms. Springer-Verlag New York, Inc., New York, NY, USA, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Neal E. Young. Sequential and parallel algorithms for mixed packing and covering. In Proceedings of IEEE Symposium on Foundations of Computer Science, 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. A generic flow algorithm for shared filter ordering problems

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in
    • Published in

      cover image ACM Conferences
      PODS '08: Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
      June 2008
      330 pages
      ISBN:9781605581521
      DOI:10.1145/1376916

      Copyright © 2008 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 9 June 2008

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • research-article

      Acceptance Rates

      PODS '08 Paper Acceptance Rate28of159submissions,18%Overall Acceptance Rate642of2,707submissions,24%

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader