ABSTRACT
We present a simple geometric framework for the relational join. Using this framework, we design an algorithm that achieves the fractional hypertree-width bound, which generalizes classical and recent worst-case algorithmic results on computing joins. We also use the framework and the same algorithm to show a series of what are colloquially known as beyond worst-case results. The framework allows us to prove results for data stored in B-trees, multidimensional data structures, and even multiple indices per table. A key idea in our framework is to formalize the inference one does with an index as a type of geometric resolution, which transforms the algorithmic problem of computing joins into a geometric problem. Our notion of geometric resolution can be viewed as a geometric analog of logical resolution. In addition to the geometry and logic connections, our algorithm can also be thought of as backtracking search with memoization.
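To make the analogy with logical resolution concrete, here is a minimal sketch of one resolution step. It rests on assumptions the abstract does not state: each region known to contain no output tuples is encoded as a dyadic box, that is, one binary prefix per attribute, and the names `Box` and `resolve` are hypothetical. Two boxes resolve when they are siblings (`x0` and `x1`) in exactly one dimension and nested in every other; the resolvent lies inside their union, much as the clauses (x OR C) and (NOT x OR D) yield (C OR D).

```python
from typing import Optional, Tuple

# Assumed encoding: one binary prefix per attribute; "" covers the whole domain.
Box = Tuple[str, ...]


def resolve(b1: Box, b2: Box) -> Optional[Box]:
    """Return the resolvent of two dyadic boxes, or None if no step applies."""
    if len(b1) != len(b2):
        return None
    out = []
    pivot_found = False
    for p, q in zip(b1, b2):
        if p.startswith(q) or q.startswith(p):
            # Nested intervals: keep the narrower one (their intersection).
            out.append(p if len(p) >= len(q) else q)
        elif not pivot_found and len(p) == len(q) and p[:-1] == q[:-1]:
            # Sibling intervals x0 and x1: merge them into the parent x.
            out.append(p[:-1])
            pivot_found = True
        else:
            # Disjoint non-sibling intervals, or a second pivot dimension.
            return None
    return tuple(out) if pivot_found else None


# Boxes over attributes (A, B): siblings on A, nested on B.
merged = resolve(("0", "10"), ("1", "1"))
```

Here `merged` is `("", "10")`: every point with `B` in `10*` is covered by one of the two input boxes, whatever its `A` value, so the two gaps combine into one wider gap.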