ARCube: supporting ranking aggregate queries in partially materialized data cubes

Published: 09 June 2008
DOI: 10.1145/1376616.1376627

Abstract

Supporting ranking queries in database systems has recently been a popular research topic. However, little work has addressed ranking queries in data warehouses, where ranking is on multidimensional aggregates rather than on measures of base facts. To address this problem, we propose a query execution model that answers different types of ranking aggregate queries based on a unified, partial cube structure, ARCube. The query execution model follows a candidate generation and verification framework, in which the most promising candidate cells are generated using a set of high-level guiding cells. We also identify a bounding principle for effective pruning: once a guiding cell is pruned, all of its child candidate cells can be pruned. We further address the problem of efficient online candidate aggregation and verification by developing a chunk-based execution model that verifies a batch of candidates within a bounded memory buffer. Our extensive performance study shows that the new framework not only yields an order-of-magnitude performance improvement over the state-of-the-art method, but is also much more flexible in the types of ranking aggregate queries it supports.
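
The bounding principle can be illustrated with a small sketch. This is not the ARCube algorithm itself, only a toy instance of candidate generation and verification under stated assumptions: the ranking function is SUM over non-negative measures, the guiding cells are the group-by on a single coarse dimension `a`, and the candidate cells are the finer group-by on `(a, b)`. Under those assumptions a guiding cell's SUM is an upper bound on every child cell's SUM, so a pruned guiding cell prunes all of its children. The function name `top_k_sum` and the tuple layout of `facts` are hypothetical.

```python
from collections import defaultdict
import heapq


def top_k_sum(facts, k):
    """Return the k cells (a, b) with the largest SUM(measure).

    `facts` is an iterable of (a, b, measure) tuples; measures are assumed
    to be non-negative so that a guiding cell bounds its children.
    Also returns how many candidate cells were actually verified.
    """
    guide = defaultdict(float)   # guiding cells: SUM grouped by `a` only
    by_a = defaultdict(list)     # base facts, grouped for on-demand verification
    for a, b, m in facts:
        guide[a] += m
        by_a[a].append((b, m))

    top = []        # min-heap of (sum, cell), holding at most k entries
    verified = 0
    # Visit guiding cells from the most to the least promising bound.
    for a, bound in sorted(guide.items(), key=lambda kv: -kv[1]):
        if len(top) == k and top[0][0] >= bound:
            # No child of this or any later guiding cell can beat the
            # current k-th best, so all of them are pruned at once.
            break
        # Candidate generation and verification for the children of `a`.
        child = defaultdict(float)
        for b, m in by_a[a]:
            child[b] += m
        for b, s in child.items():
            verified += 1
            entry = (s, (a, b))
            if len(top) < k:
                heapq.heappush(top, entry)
            elif entry > top[0]:
                heapq.heapreplace(top, entry)
    return sorted(top, reverse=True), verified
```

In a real partially materialized cube the guiding aggregates would be read from precomputed cuboids rather than rebuilt from the base facts; they are computed inline here only to keep the sketch self-contained.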

Published In

SIGMOD '08: Proceedings of the 2008 ACM SIGMOD international conference on Management of data
June 2008
1396 pages
ISBN: 9781605581026
DOI: 10.1145/1376616
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data cube
  2. partial materialization
  3. ranking aggregate queries

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '08

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
