Projection Pushing Revisited

  • Conference paper
Advances in Database Technology - EDBT 2004 (EDBT 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2992)

Abstract

The join operation, which combines tuples from multiple relations, is the most fundamental and, typically, the most expensive operation in database queries. The standard approach to join-query optimization is cost based, which requires developing a cost model, assigning an estimated cost to each query-processing plan, and searching in the space of all plans for a plan of minimal cost. Two other approaches can be found in the database-theory literature. The first approach, initially proposed by Chandra and Merlin, focused on minimizing the number of joins rather than on selecting an optimal join order. Unfortunately, this approach requires a homomorphism test, which itself is NP-complete, and has not been pursued in practical query processing. The second, more recent, approach focuses on structural properties of the query in order to find a project-join order that will minimize the size of intermediate results during query evaluation. For example, it is known that for Boolean project-join queries a project-join order can be found such that the arity of intermediate results is the treewidth of the join graph plus one.
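
As a small illustration of the treewidth bound (an example constructed here, not taken from the paper), consider a Boolean query over three binary relations R(a,b), S(b,c), T(c,d), whose names are hypothetical. Its join graph is the path a, b, c, d, which has treewidth 1. Joining first and projecting last builds an intermediate result of arity 4, whereas projecting each attribute out as soon as it is no longer needed keeps every intermediate result at arity 2, that is, treewidth plus one:

```latex
\pi_{\emptyset}\bigl(R \bowtie S \bowtie T\bigr)
  \;=\;
\pi_{\emptyset}\Bigl(\pi_{c}\bigl(\pi_{b}(R) \bowtie S\bigr) \bowtie T\Bigr)
% schemas of the intermediate results on the right-hand side:
%   \pi_b(R): \{b\}, \quad \pi_b(R) \bowtie S: \{b,c\}, \quad
%   \pi_c(\cdot): \{c\}, \quad \pi_c(\cdot) \bowtie T: \{c,d\}
```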

In this paper we pursue the structural-optimization approach, motivated by its success in the context of constraint satisfaction. We chose a setup in which the cost-based approach is rather ineffective; we generate project-join queries with a large number of relations over databases with small relations. We show that a standard SQL planner (we use PostgreSQL) spends an exponential amount of time on generating plans for such queries, with rather dismal results in terms of performance. We then show how structural techniques, including projection pushing and join reordering, can yield exponential improvements in query execution time. Finally, we combine early projection and join reordering in an implementation of the bucket-elimination method from constraint satisfaction to obtain another exponential improvement.
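
The combination of early projection and join reordering can be sketched in a few lines of Python. This is a minimal illustration of the general bucket-elimination idea for Boolean project-join queries, not the authors' implementation. It assumes relations are held in memory as `(schema, rows)` pairs and that the elimination order lists every attribute. The helper names (`natural_join`, `project_out`, `bucket_eliminate`, `edge`) are hypothetical, and the graph-coloring relations are used only as convenient test data.

```python
from itertools import product

def natural_join(r, s):
    """Natural join of two relations, each given as (schema, rows)."""
    (rs, rrows), (ss, srows) = r, s
    shared = [a for a in rs if a in ss]
    extra = [a for a in ss if a not in rs]
    out = set()
    for x in rrows:
        xd = dict(zip(rs, x))
        for y in srows:
            yd = dict(zip(ss, y))
            if all(xd[a] == yd[a] for a in shared):
                out.add(x + tuple(yd[a] for a in extra))
    return tuple(rs) + tuple(extra), out

def project_out(r, var):
    """Drop attribute `var` from relation r (duplicates collapse in the set)."""
    schema, rows = r
    keep = [i for i, a in enumerate(schema) if a != var]
    return (tuple(schema[i] for i in keep),
            {tuple(row[i] for i in keep) for row in rows})

def bucket_eliminate(relations, order):
    """Evaluate a Boolean project-join query by eliminating attributes in `order`.

    Assumes `order` lists every attribute. Returns (answer, max_arity), where
    max_arity is the largest arity of any intermediate join result.
    """
    pool = list(relations)
    max_arity = 0
    for var in order:
        # The "bucket" of var: every relation that still mentions it.
        bucket = [r for r in pool if var in r[0]]
        pool = [r for r in pool if var not in r[0]]
        if not bucket:
            continue
        joined = bucket[0]
        for r in bucket[1:]:
            joined = natural_join(joined, r)
        max_arity = max(max_arity, len(joined[0]))
        # Project var out immediately; the result may land in a later bucket.
        pool.append(project_out(joined, var))
    # Every remaining relation has an empty schema: {()} means true, set() false.
    return all(rows for _, rows in pool), max_arity

def edge(u, v, k):
    """Test data: pairs of distinct values from range(k) over attributes (u, v)."""
    return (u, v), {(x, y) for x, y in product(range(k), repeat=2) if x != y}

path = [edge("a", "b", 3), edge("b", "c", 3), edge("c", "d", 3)]
print(bucket_eliminate(path, ["a", "b", "c", "d"]))
```

On the path query no intermediate result exceeds arity 2, in line with the treewidth-plus-one bound. How large the intermediate results get depends on the chosen elimination order, which this sketch takes as given.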

Work supported in part by NSF grants CCR-9988322, CCR-0124077, CCR-0311326, IIS-9908435, IIS-9978135, and EIA-0086264.

References

  1. Abiteboul, S., Hull, R., Vianu, V.: Foundations of databases. Addison-Wesley, Reading (1995)

  2. Aho, A., Sagiv, Y., Ullman, J.D.: Efficient optimization of a class of relational expressions. ACM Trans. on Database Systems 4, 435–454 (1979)

  3. Aho, A., Sagiv, Y., Ullman, J.D.: Equivalence of relational expressions. SIAM Journal on Computing 8, 218–246 (1979)

  4. Apers, P., Hevner, A., Yao, S.: Optimization algorithms for distributed queries. IEEE Trans. Software Engineering 9(1), 57–68 (1983)

  5. Arnborg, S., Corneil, D.G., Proskurowski, A.: Complexity of finding embeddings in a k-tree. SIAM Journal on Algebraic and Discrete Methods 8(2), 277–284 (1987)

  6. Bodlaender, H.L.: A tourist guide through treewidth. Acta Cybernetica 11, 1–21 (1993)

  7. Bouquet, F.: Gestion de la dynamicité et énumération d’implicants premiers: une approche fondée sur les Diagrammes de Décision Binaire. PhD thesis, Université de Provence, France (1999)

  8. Chandra, A.K., Merlin, P.M.: Optimal implementation of conjunctive queries in relational databases. In: Proc. 9th ACM Symp. on Theory of Computing, pp. 77–90 (1977)

  9. Chauhan, P., Clarke, E.M., Jha, S., Kukula, J.H., Veith, H., Wang, D.: Using combinatorial optimization methods for quantification scheduling. In: Proc. 11th Conf. on Correct Hardware Design and Verification Methods, pp. 293–309 (2001)

  10. Chekuri, C., Rajaraman, A.: Conjunctive query containment revisited. Technical report, Stanford University (November 1998)

  11. Dalmau, V., Kolaitis, P.G., Vardi, M.Y.: Constraint satisfaction, bounded treewidth, and finite-variable logics. In: Van Hentenryck, P. (ed.) CP 2002. LNCS, vol. 2470, pp. 311–326. Springer, Heidelberg (2002)

  12. Dechter, R.: Mini-buckets: A general scheme for generating approximations in automated reasoning. In: International Joint Conference on Artificial Intelligence, pp. 1297–1303 (1997)

  13. Dechter, R.: Bucket elimination: a unifying framework for reasoning. Artificial Intelligence 113(1-2), 41–85 (1999)

  14. Dechter, R.: Constraint Processing. Morgan Kaufmann, San Francisco (2003)

  15. Dechter, R., Pearl, J.: Network-based heuristics for constraint-satisfaction problems. Artificial Intelligence 34, 1–38 (1987)

  16. Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer, Heidelberg (1999)

  17. Freuder, E.C.: Complexity of k-tree structured constraint satisfaction problems. In: Proc. AAAI 1990, pp. 4–9 (1990)

  18. Freytag, J.C.: A rule-based view of query optimization. In: Proceedings of the 1987 ACM SIGMOD International Conference on Management of Data, pp. 173–180 (1987)

  19. Garcia-Molina, H., Ullman, J.D., Widom, J.: Database System Implementation. Prentice-Hall, Englewood Cliffs (2000)

  20. Garey, M.R., Johnson, D.S.: Computers and Intractability, A Guide to the Theory of NP-Completeness. W. H. Freeman, New York (1979)

  21. Gottlob, G., Leone, N., Scarcello, F.: Hypertree decompositions and tractable queries. In: Proc. 18th ACM Symp. on Principles of Database Systems, pp. 21–32 (1999)

  22. Griffiths, P.P., Astrahan, M.M., Chamberlin, D.D., Lorie, R.A., Price, T.G.: Access path selection in a relational database management system. In: ACM SIGMOD International Conference on Management of Data, pp. 23–34 (1979)

  23. Halevy, A.: Answering queries using views: A survey. VLDB Journal, 270–294 (2001)

  24. Hojati, R., Krishnan, S.C., Brayton, R.K.: Early quantification and partitioned transition relations. In: Proc. 1996 Int’l Conf. on Computer Design, pp. 12–19 (1996)

  25. Ioannidis, Y., Wong, E.: Query optimization by simulated annealing. In: ACM SIGMOD International Conference on Management of Data, pp. 9–22 (1987)

  26. Kolaitis, P.G., Vardi, M.Y.: Conjunctive-query containment and constraint satisfaction. Journal of Computer and System Sciences, 302–332 (2000); Earlier version in: Proc. 17th ACM Symp. on Principles of Database Systems (PODS 1998) (1998)

  27. Kunen, I.K., Suciu, D.: A scalable algorithm for query minimization. Technical report, University of Washington (2002)

  28. Ramakrishnan, R., Beeri, C., Krishnamurthi, R.: Optimizing existential datalog queries. In: Proceedings of the ACM Symposium on Principles of Database Systems, pp. 89–102 (1988)

  29. Rish, I., Dechter, R.: Resolution versus search: Two strategies for SAT. Journal of Automated Reasoning 24(1/2), 225–275 (2000)

  30. San Miguel Aguirre, A., Vardi, M.Y.: Random 3-SAT and BDDs – the plot thickens further. In: Walsh, T. (ed.) CP 2001. LNCS, vol. 2239, pp. 121–136. Springer, Heidelberg (2001)

  31. Tarjan, R.E., Yannakakis, M.: Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM J. on Computing 13(3), 566–579 (1984)

  32. Ullman, J.D.: Database and Knowledge-Base Systems, vol. I and II. Computer Science Press, Rockville (1989)

  33. Vardi, M.Y.: On the complexity of bounded-variable queries. In: Proc. 14th ACM Symp. on Principles of Database Systems, pp. 266–276 (1995)

  34. Wong, E., Youssefi, K.: Decomposition - a strategy for query processing. ACM Trans. on Database Systems 1(3), 223–241 (1976)

  35. Yannakakis, M.: Algorithms for acyclic database schemes. In: Proc. 7th Int’l Conf. on Very Large Data Bases, pp. 82–94 (1981)

  36. Yerneni, R., Li, C., Ullman, J.D., Garcia-Molina, H.: Optimizing large join queries in mediation systems. In: Beeri, C., Buneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 348–364. Springer, Heidelberg (1998)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

McMahan, B.J., Pan, G., Porter, P., Vardi, M.Y. (2004). Projection Pushing Revisited. In: Bertino, E., et al. Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24741-8_26

  • DOI: https://doi.org/10.1007/978-3-540-24741-8_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21200-3

  • Online ISBN: 978-3-540-24741-8

  • eBook Packages: Springer Book Archive
