DOI: 10.1145/2745754.2745776
research-article

Joins via Geometric Resolutions: Worst-case and Beyond

Published: 20 May 2015

ABSTRACT

We present a simple geometric framework for the relational join. Using this framework, we design an algorithm that achieves the fractional hypertree-width bound, which generalizes classical and recent worst-case algorithmic results on computing joins. In addition, we use our framework and the same algorithm to show a series of what are colloquially known as beyond worst-case results. The framework allows us to prove results for data stored in B-trees, multidimensional data structures, and even multiple indices per table. A key idea in our framework is formalizing the inference one does with an index as a type of geometric resolution, which transforms the algorithmic problem of computing joins into a geometric problem. Our notion of geometric resolution can be viewed as a geometric analog of logical resolution. In addition to the geometry and logic connections, our algorithm can also be thought of as backtracking search with memoization.
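
The abstract does not spell out the resolution rule, so the following is only an illustrative sketch of how a geometric analog of logical resolution might look. It assumes that regions known to contain no output tuples are dyadic boxes, encoded as tuples of bit-string prefixes (one prefix per attribute, with the empty string meaning the whole domain). The names `resolve` and `points` are hypothetical and do not come from the paper. Two boxes that are sibling halves along one dimension are merged into their parent on that dimension, much as the clauses (x OR C) and (NOT x OR D) resolve to (C OR D).

```python
from itertools import product


def resolve(a, b, dim):
    """Combine two dyadic boxes that are sibling halves along `dim`.

    Returns the resolvent box, or None when the rule does not apply.
    Assumption: a box is a tuple of bit-string prefixes, one per attribute.
    """
    pa, pb = a[dim], b[dim]
    # The prefixes must share a parent and differ only in the last bit.
    if not pa or not pb or pa[:-1] != pb[:-1] or pa[-1] == pb[-1]:
        return None
    out = []
    for j, (x, y) in enumerate(zip(a, b)):
        if j == dim:
            out.append(pa[:-1])   # merge the two halves into the parent
        elif x.startswith(y):
            out.append(x)         # x is the narrower interval
        elif y.startswith(x):
            out.append(y)         # y is the narrower interval
        else:
            return None           # the boxes are disjoint on dimension j
    return tuple(out)


def points(box, bits):
    """Enumerate every point (tuple of full-length bit strings) in a box."""
    axes = [
        [p + "".join(s) for s in product("01", repeat=bits - len(p))]
        for p in box
    ]
    return set(product(*axes))


a = ("0", "10")    # first attribute starts with 0, second attribute is 10
b = ("01", "11")   # first attribute is 01, second attribute is 11
r = resolve(a, b, 1)
print(r)
# Every point of the resolvent is already covered by a or b.
print(points(r, 2) <= points(a, 2) | points(b, 2))
```

In this reading the resolvent is a new box inferred from two known ones, and it never covers a point that the two inputs did not already cover, which is the property that makes the inference safe.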

Published in

PODS '15: Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
May 2015, 358 pages
ISBN: 9781450327572
DOI: 10.1145/2745754
General Chair: Tova Milo
Program Chair: Diego Calvanese

Copyright © 2015 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

PODS '15 paper acceptance rate: 25 of 80 submissions, 31%. Overall acceptance rate: 642 of 2,707 submissions, 24%.
