Abstract
Query optimization is the process of finding an efficient query execution plan for a given SQL query. The runtime difference between a good and a bad plan can be tremendous. For example, in the case of TPC-H query 5, a query with 5 joins, the difference between the best and the worst plan is more than 10,000×. Therefore, it is vital to avoid bad plans. The dominant factor that separates a good plan from a bad one is its join order, and in particular whether that order avoids large intermediate results.
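To make the effect of join order on intermediate results concrete, the following is a minimal sketch, not the method of this work. It enumerates all left-deep join orders of three hypothetical relations and scores each by the sum of its intermediate result sizes (a cost measure often called C_out). The relation names, cardinalities, selectivities, and the helper `c_out` are all assumptions chosen for illustration.

```python
from itertools import permutations

# Hypothetical base-table cardinalities (illustrative, not from the paper).
CARD = {"R": 1000, "S": 100, "T": 10}

# Hypothetical join selectivities; pairs without an entry have no join
# predicate, so combining them is a cross product (selectivity 1.0).
SEL = {
    frozenset(("R", "S")): 0.01,
    frozenset(("S", "T")): 0.1,
}


def c_out(order):
    """Sum of intermediate result sizes for a left-deep plan over `order`."""
    joined = [order[0]]
    size = float(CARD[order[0]])
    total = 0.0
    for rel in order[1:]:
        # Estimated size: product of input sizes times all applicable selectivities.
        size *= CARD[rel]
        for prev in joined:
            size *= SEL.get(frozenset((prev, rel)), 1.0)
        joined.append(rel)
        total += size
    return total


costs = {order: c_out(order) for order in permutations(CARD)}
best = min(costs, key=costs.get)
worst = max(costs, key=costs.get)

if __name__ == "__main__":
    for order, cost in sorted(costs.items(), key=lambda kv: kv[1]):
        print(" ⋈ ".join(order), cost)
```

In this toy setting, the orders that begin with the cross product of R and T come out roughly ten times more expensive than the orders that join S and T first, even though every order produces the same final result. With more joins and more skewed sizes the gap grows much wider, which is the effect the abstract describes.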