Selectivity-based partitioning: a divide-and-union paradigm for effective query optimization

Author:
Neoklis Polyzotis

Univ. of California, Santa Cruz

CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management, October 2005, Pages 720–727. https://doi.org/10.1145/1099554.1099730

Published: 31 October 2005

ABSTRACT

Modern query optimizers select a join ordering for a physical execution plan based essentially on the average join selectivity factors among the referenced tables. In this paper, we argue that this "monolithic" approach can miss important opportunities for optimizing relational queries. We propose selectivity-based partitioning, a novel optimization paradigm that takes into account the join correlations among relation fragments in order to enable multiple (and more effective) join orders for the evaluation of a single query. In a nutshell, the basic idea is to carefully partition a relation according to the selectivities of the join operations, and subsequently rewrite the query as a union of constituent queries over the computed partitions. We provide a formal definition of the related optimization problem and derive properties that characterize the set of optimal solutions. Based on our analysis, we develop a heuristic algorithm for efficiently computing an effective partitioning of the input query. Results from a preliminary experimental study verify the effectiveness of the proposed approach and demonstrate its potential as an optimization technique.
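
The divide-and-union idea can be illustrated with a toy sketch. This is not the paper's heuristic algorithm; the relations `R`, `S`, `T`, their contents, the hand-picked partition on `a`, and the cost measure (number of intermediate tuples) are all assumptions made for illustration. It only shows that when two fragments of a relation have opposite join selectivities, evaluating each fragment with its own join order and taking the union returns the same answer with fewer intermediate tuples than any single order.

```python
def join(left, right, key):
    """Nested-loop equi-join of two lists of dict rows on a shared key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def run_plan(r_rows, s_rows, t_rows, order):
    """Join r_rows with S and T in the given order.

    Returns the result rows and the size of the intermediate result,
    which serves here as a stand-in cost (an assumption of this sketch).
    """
    rels = {"S": (s_rows, "a"), "T": (t_rows, "b")}
    first_rel, first_key = rels[order[0]]
    second_rel, second_key = rels[order[1]]
    intermediate = join(r_rows, first_rel, first_key)
    result = join(intermediate, second_rel, second_key)
    return result, len(intermediate)


def canonical(rows):
    """Order-independent representation of a result, for comparison."""
    return sorted((r["rid"], r["sid"], r["tid"]) for r in rows)


# Hypothetical data: the a=0 fragment of R matches many S tuples but one T tuple;
# the a=1 fragment matches one S tuple but many T tuples.
R = ([{"rid": i, "a": 0, "b": 0} for i in range(5)]
     + [{"rid": 5 + i, "a": 1, "b": 1} for i in range(5)])
S = [{"sid": i, "a": 0} for i in range(10)] + [{"sid": 10, "a": 1}]
T = [{"tid": 0, "b": 0}] + [{"tid": 1 + i, "b": 1} for i in range(10)]

# Monolithic plan: one join order for all of R.
mono_rows, mono_cost = run_plan(R, S, T, ("S", "T"))

# Partitioned plan: each fragment starts with its more selective join.
frag_a = [r for r in R if r["a"] == 0]
frag_b = [r for r in R if r["a"] == 1]
rows_a, cost_a = run_plan(frag_a, S, T, ("T", "S"))
rows_b, cost_b = run_plan(frag_b, S, T, ("S", "T"))
part_rows, part_cost = rows_a + rows_b, cost_a + cost_b

print(mono_cost, part_cost)  # intermediate tuples: monolithic vs. partitioned
```

In this constructed example the other monolithic order, `("T", "S")`, produces the same intermediate size as `("S", "T")` by symmetry, so only the per-fragment orders reduce the cost.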

Index Terms

1. Information systems
  1. Data management systems
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory
      1. Database query processing and optimization (theory)

Published in
CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
October 2005
854 pages
ISBN: 1595931406
DOI: 10.1145/1099554
General Chair: Otthein Herzog (University of Bremen, Germany)
Program Chairs: Hans-Jörg Schek (University for Health Sciences, Medical Informatics and Technology, Austria), Norbert Fuhr (University of Duisburg-Essen, Germany), Abdur Chowdhury (America Online, USA), Wilfried Teiken (IBM T.J. Watson Research Center, USA)
Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States

Acceptance Rates
CIKM '05 Paper Acceptance Rate: 77 of 425 submissions, 18%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%