Abstract
Datacube queries compute simple aggregates at multiple granularities. In this paper we examine the more general and useful problem of computing a complex subquery involving multiple dependent aggregates at multiple granularities. We call such queries “multi-feature cubes.” An example is “Broken down by all combinations of month and customer, find the fraction of the total sales in 1996 of a particular item due to suppliers supplying within 10% of the minimum price (within the group), showing all subtotals across each dimension.” We classify multi-feature cubes based on the extent to which fine granularity results can be used to compute coarse granularity results; this classification includes distributive, algebraic and holistic multi-feature cubes. We provide syntactic sufficient conditions to determine when a multi-feature cube is either distributive or algebraic. This distinction is important because, as we show, existing datacube evaluation algorithms can be used to compute multi-feature cubes that are distributive or algebraic, without any increase in I/O complexity. We evaluate the CPU performance of computing multi-feature cubes using the datacube evaluation algorithm of Ross and Srivastava. Using a variety of synthetic, benchmark and real-world data sets, we demonstrate that the CPU cost of evaluating distributive multi-feature cubes is comparable to that of evaluating simple datacubes. We also show that a variety of holistic multi-feature cubes can be evaluated with a manageable overhead compared to the distributive case.
Preview
Unable to display preview. Download preview PDF.
References
S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. In Proceedings of VLDB, pages 506–521, 1996.
R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. In Proceedings of IEEE ICDE, 1997.
D. Chatziantoniou and K. A. Ross. Querying multiple features of groups in relational databases. In Proceedings of VLDB, pages 295–306, 1996.
J. Gray, A. Bosworth, A. Layman, and H. Pirahesh. Datacube: A relational aggregation operator generalizing group-by, cross-tab, and sub-totals. In Proceedings of IEEE ICDE, pages 152–159, 1996. Also available as Microsoft Technical Report MSR-TR-95-22.
C. J. Hahn, S. G. Warren, and J. London. Edited synoptic cloud reports from ships and land stations over the globe, 1982–1991. Available from http://cdiac.esd.ornl.gov/cdiac/ndps/ndp026b.html, 1994.
C. Li and X. S. Wang. A data model for supporting on-line analytical processing. In Proceedings of CIKM, pages 81–88, 1996.
K. A. Ross and D. Srivastava. Fast computation of sparse datacubes. In Proceedings of VLDB, pages 116–125, 1997.
K. A. Ross, D. Srivastava and D. Chatziantoniou. Complex aggregation at multiple granularities. AT&T Technical Report, 1997.
Transaction Processing Performance Council (TPC), 777 N. First Street, Suite 600, San Jose, CA 95112, USA. TPC Benchmark D (Decision Support), May 1995.
Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. In Proceedings of ACM SIGMOD, pages 159–170, 1997.
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ross, K.A., Srivastava, D., Chatziantoniou, D. (1998). Complex aggregation at multiple granularities. In: Schek, HJ., Alonso, G., Saltor, F., Ramos, I. (eds) Advances in Database Technology — EDBT'98. EDBT 1998. Lecture Notes in Computer Science, vol 1377. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100990
Download citation
DOI: https://doi.org/10.1007/BFb0100990
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64264-0
Online ISBN: 978-3-540-69709-1
eBook Packages: Springer Book Archive