ABSTRACT
We consider the complexity of join problems, focusing on equijoins, spatial-overlap joins, and set-containment joins. We use a graph pebbling model to characterize these joins combinatorially, by the length of their optimal pebbling strategies, and computationally, by the complexity of discovering these strategies. Our results show that equijoins are the easiest of all joins, with optimal pebbling strategies that meet the lower bound over all join problems and that can be found in linear time. By contrast, spatial-overlap and set-containment joins are the hardest joins, with instances where optimal pebbling strategies reach the upper bound over all join problems and with the problem of discovering optimal pebbling strategies being NP-complete. For set-containment joins, we show that discovering the optimal pebbling is also MAX-SNP-complete. As a consequence, we show that unless NP = P, there is a constant ε₀ such that this problem cannot be approximated within a factor of 1 + ε₀ in polynomial time. Our results shed some light on the difficulty the applied community has had in finding “good” algorithms for spatial-overlap and set-containment joins.
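To make the equijoin claim concrete, here is a minimal sketch under assumptions the abstract does not spell out: a two-pebble game on the bipartite join graph, where each move relocates one pebble and a matching tuple pair (an edge) is covered when both of its endpoints hold pebbles. Under that assumed model, the tuples sharing one join value form a complete bipartite graph, and a back-and-forth sweep covers every edge with one move per edge after the initial placement. The function names `pebble_equijoin_group` and `count_moves` are hypothetical, chosen for illustration only.

```python
def pebble_equijoin_group(left, right):
    """Return pebble configurations (l, r) covering every edge of the
    complete bipartite graph on left x right (one equijoin value group).

    Assumed model: one pebble sits on a left tuple, one on a right tuple.
    """
    right = list(right)
    configs = []
    for i, l in enumerate(left):
        # Sweep the right side forward on even rows and backward on odd
        # rows, so consecutive configurations differ in exactly one pebble.
        sweep = right if i % 2 == 0 else right[::-1]
        for r in sweep:
            configs.append((l, r))
    return configs


def count_moves(configs):
    """Count pebble moves: two initial placements, then one move for each
    pebble that changes position between consecutive configurations."""
    if not configs:
        return 0
    moves = 2
    for (l0, r0), (l1, r1) in zip(configs, configs[1:]):
        moves += (l0 != l1) + (r0 != r1)
    return moves


if __name__ == "__main__":
    group = pebble_equijoin_group(["a", "b"], [1, 2, 3])
    print(group)
    print(count_moves(group))
```

The sweep visits each of the |left| × |right| edges exactly once and runs in time linear in the number of edges, which is the kind of behavior the abstract attributes to equijoins. Whether this matches the paper's exact cost accounting is an assumption.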
Index Terms
- On the complexity of join predicates