skip to main content
10.1145/3035918.3064053acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Public Access

Optimizing Iceberg Queries with Complex Joins

Published: 09 May 2017 Publication History

Abstract

Iceberg queries, commonly used for decision support, find groups whose aggregate values are above or below a threshold. In practice, iceberg queries are often posed over complex joins that are expensive to evaluate. This paper proposes a framework for combining a number of techniques---a-priori, memoization, and pruning---to optimize iceberg queries with complex joins. A-priori pushes partial GROUP BY and HAVING condition before a join to reduce its input size. Memoization caches and reuses join computation results. Pruning uses cached results to infer that certain tuples cannot contribute to the final query result, and short-circuits join computation. We formally derive conditions for correctly applying these techniques. Our practical rewrite algorithm produces highly efficient SQL that can exploit combinations of optimization opportunities in ways previously not possible. We evaluate our PostgreSQL-based implementation experimentally and show that it outperforms both baseline PostgreSQL and a commercial database system.

References

[1]
http://www.500hrc.com/500-hrc-articles/cubs-teammates-compete-to-be-next-to-300.html.
[2]
http://www.seanlahman.com/baseball-archive/statistics/. 2017.
[3]
R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In VLDB, pages 487--499, 1994.
[4]
F. Bancilhon, D. Maier, Y. Sagiv, and J. D. Ullman. Magic sets and other strange ways to implement logic programs. In SIGMOD, pages 1--15. ACM, 1985.
[5]
K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg cube. In ACM SIGMOD Record, volume 28, pages 359--370. ACM, 1999.
[6]
C.-Y. Chan, H. V. Jagadish, K.-L. Tan, A. K. H. Tung, and Z. Zhang. Finding k-dominant skylines in high dimensional space. In SIGMOD, pages 503--514, 2006.
[7]
J. Chomicki, P. Ciaccia, and N. Meneghetti. Skyline queries, front and back. SIGMOD Rec., 42(3):6--18, Oct. 2013.
[8]
M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J. D. Ullman. Computing iceberg queries efficiently. In VLDB. Stanford InfoLab, 1999.
[9]
J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. Data cube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. Data Min. Knowl. Discov., 1(1):29--53, Jan. 1997.
[10]
A. Kalinin, U. Cetintemel, and S. Zdonik. Searchlight: Enabling integrated search and exploration over large multi-dimensional data. In VLDB, volume 8, pages 1094--1105. VLDB Endowment, 2015.
[11]
L. Khachiyan. Fourier--Motzkin elimination method, pages 1074--1077. Encyclopedia of Optimization, Springer US, Boston, MA, 2009.
[12]
I. S. Mumick, S. J. Finkelstein, H. Pirahesh, and R. Ramakrishnan. Magic is relevant. In ACM SIGMOD Record, volume 19, pages 247--258. ACM, 1990.
[13]
R. T. Ng, L. V. S. Lakshmanan, J. Han, and A. Pang. Exploratory mining and pruning optimizations of constrained associations rules. SIGMOD Rec., 27(2):13--24, June 1998.
[14]
D. Papadias, Y. Tao, G. Fu, and B. Seeger. Progressive skyline computation in database systems. ACM Trans. Database Syst., 30(1):41--82, Mar. 2005.
[15]
S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational database systems: Alternatives and implications. SIGMOD Rec., 27(2):343--354, June 1998.
[16]
P. Seshadri, J. M. Hellerstein, H. Pirahesh, T. Leung, R. Ramakrishnan, D. Srivastava, P. J. Stuckey, and S. Sudarshan. Cost-based optimization for magic: Algebra and implementation. In ACM SIGMOD Record, volume 25, pages 435--446. ACM, 1996.
[17]
Y. Shou, N. Mamoulis, H. Cao, D. Papadias, and D. W. Cheung. Evaluation of iceberg distance joins. In SSTD, pages 270--288. Springer, 2003.
[18]
D. Tsur, J. D. Ullman, S. Abiteboul, C. Clifton, R. Motwani, S. Nestorov, and A. Rosenthal. Query flocks: A generalization of association-rule mining. In ACM SIGMOD Record, volume 27, pages 1--12. ACM, 1998.
[19]
B. Walenz and J. Yang. Perturbation analysis of database queries. In VLDB, volume 9, pages 1635--1646, 2016.
[20]
Wolfram Research Inc. Mathematica Online, 2016.
[21]
W. Yan and P.-A. Larson. Interchanging the order of grouping and join. Technical report, Technical Report CS 95-09, Dept. of Computer Science, University of Waterloo, Canada, 1995.
[22]
W. P. Yan and P.-A. Larson. Performing group-by before join. In Data Engineering, 1994. Proceedings. 10th International Conference, pages 89--100. IEEE, 1994.
[23]
W. P. Yan, P.-B. Larson, et al. Eager aggregation and lazy aggregation. In VLDB, volume 95, pages 345--357, 1995.

Cited By

View all
  • (2024)Thorough Data Pruning for Join Query in Database SystemIEEE Transactions on Sustainable Computing10.1109/TSUSC.2023.32793829:3(409-421)Online publication date: May-2024
  • (2023)Scaling a Declarative Cluster Manager Architecture with Query Optimization TechniquesProceedings of the VLDB Endowment10.14778/3603581.360359916:10(2618-2631)Online publication date: 8-Aug-2023
  • (2023)SecBerg: Secure and Practical Iceberg Queries in CloudIEEE Transactions on Services Computing10.1109/TSC.2023.326471016:5(3696-3710)Online publication date: Sep-2023
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
May 2017
1810 pages
ISBN:9781450341974
DOI:10.1145/3035918
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 May 2017

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. databases
  2. iceberg
  3. iceberg queries
  4. postgresql
  5. query optimization

Qualifiers

  • Research-article

Funding Sources

Conference

SIGMOD/PODS'17
Sponsor:

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)101
  • Downloads (Last 6 weeks)18
Reflects downloads up to 05 Mar 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Thorough Data Pruning for Join Query in Database SystemIEEE Transactions on Sustainable Computing10.1109/TSUSC.2023.32793829:3(409-421)Online publication date: May-2024
  • (2023)Scaling a Declarative Cluster Manager Architecture with Query Optimization TechniquesProceedings of the VLDB Endowment10.14778/3603581.360359916:10(2618-2631)Online publication date: 8-Aug-2023
  • (2023)SecBerg: Secure and Practical Iceberg Queries in CloudIEEE Transactions on Services Computing10.1109/TSC.2023.326471016:5(3696-3710)Online publication date: Sep-2023
  • (2022)Toward interpretable and actionable data analysis with explanations and causalityProceedings of the VLDB Endowment10.14778/3554821.355490215:12(3812-3820)Online publication date: 1-Aug-2022
  • (2021)SIAProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457262(2169-2181)Online publication date: 9-Jun-2021
  • (2021)Data-induced predicates for sideways information passing in query optimizersThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-021-00693-231:6(1263-1290)Online publication date: 29-Aug-2021
  • (2019)Pushing data-induced predicates through joins in big-data clustersProceedings of the VLDB Endowment10.14778/3368289.336829213:3(252-265)Online publication date: 1-Nov-2019

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media