The Multi-Tree Cubing algorithm for computing iceberg cubes

Li, Xing; Hamilton, Howard J.; Karimi, Kamran; Geng, Liqiang

doi:10.1007/s10844-008-0074-3

Published: 30 October 2008

Volume 33, pages 179–208, (2009)

Journal of Intelligent Information Systems

Abstract

The computation of data cubes is one of the most expensive operations in on-line analytical processing (OLAP). To improve efficiency, an iceberg cube represents only the cells whose aggregate values are above a given threshold (minimum support). Top-down and bottom-up approaches are used to compute the iceberg cube for a data set, but both have performance limitations. In this paper, a new algorithm, called Multi-Tree Cubing (MTC), is proposed for computing an iceberg cube. The MTC algorithm is an integrated top-down and bottom-up approach. Overall control is handled in a top-down manner, so MTC features shared computation. By processing the orderings in the opposite order from the Top-Down Computation algorithm, the MTC algorithm is able to prune attributes. The Bottom-Up Computation (BUC) algorithm and its variations also perform pruning, but they rely on the processing of intermediate partitions. The MTC algorithm, however, prunes without processing such partitions. The MTC algorithm is based on a specialized type of prefix tree data structure, called an Attribute–Partition tree (AP-tree), consisting of attribute and partition nodes. The AP-tree facilitates fast, in-memory sorting and APRIORI-like pruning. We report on five series of experiments, which confirm that MTC is consistently as fast as or faster than BUC, while finding the same iceberg cubes.
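
To make the notions of an iceberg cube and APRIORI-like pruning concrete, the following is a minimal sketch in Python. It is not the MTC algorithm and does not use an AP-tree; it is a naive level-wise computation, assuming that the aggregate is a simple row count and that the threshold is inclusive. The function name `iceberg_cube` and the cell representation (a pair of dimension indices and values) are hypothetical choices made for illustration only.

```python
from collections import Counter
from itertools import combinations


def iceberg_cube(rows, min_support):
    """Return {(dims, values): count} for cells with count >= min_support.

    Illustrative sketch only: the measure is assumed to be COUNT, and a
    cell is identified by a tuple of dimension indices plus the matching
    tuple of attribute values.
    """
    result = {}
    if not rows or len(rows) < min_support:
        return result

    # The apex cell aggregates over all rows (no grouping attributes).
    result[((), ())] = len(rows)
    previous = {((), ())}
    num_dims = len(rows[0])

    for k in range(1, num_dims + 1):
        counts = Counter()
        for dims in combinations(range(num_dims), k):
            for row in rows:
                values = tuple(row[d] for d in dims)
                # Apriori-like pruning: a cell can meet the threshold only
                # if every cell obtained by dropping one attribute does.
                parents_ok = all(
                    (dims[:i] + dims[i + 1:], values[:i] + values[i + 1:])
                    in previous
                    for i in range(k)
                )
                if parents_ok:
                    counts[(dims, values)] += 1

        current = {cell for cell, n in counts.items() if n >= min_support}
        if not current:
            break  # no larger cell can satisfy the threshold
        for cell in current:
            result[cell] = counts[cell]
        previous = current

    return result


rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "y")]
cube = iceberg_cube(rows, min_support=2)
```

In this toy data set, the value `b` occurs only once in the first dimension, so any cell that contains it is skipped without being counted at the next level. The sketch rescans all rows for every group-by, which is exactly the kind of cost that shared computation and partition-based methods are meant to avoid.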



Author information

Authors and Affiliations

Department of Computer Science, University of Regina, Regina, Saskatchewan, S4S 0A2, Canada
Xing Li, Howard J. Hamilton, Kamran Karimi & Liqiang Geng

Corresponding author

Correspondence to Howard J. Hamilton.

About this article

Cite this article

Li, X., Hamilton, H.J., Karimi, K. et al. The Multi-Tree Cubing algorithm for computing iceberg cubes. J Intell Inf Syst 33, 179–208 (2009). https://doi.org/10.1007/s10844-008-0074-3

Received: 26 April 2006
Revised: 28 July 2008
Accepted: 03 September 2008
Published: 30 October 2008
Issue Date: October 2009
DOI: https://doi.org/10.1007/s10844-008-0074-3
