Cubegrades: Generalizing Association Rules

Imieliński, Tomasz; Khachiyan, Leonid; Abdulghani, Amin

doi:10.1023/A:1015417610840

Published: July 2002

Data Mining and Knowledge Discovery, Volume 6, pages 219–257 (2002)

Abstract

Cubegrades are a generalization of association rules which represent how a set of measures (aggregates) is affected by modifying a cube through specialization (rolldown), generalization (rollup) and mutation (which is a change in one of the cube's dimensions). Cubegrades are significantly more expressive than association rules in capturing trends and patterns in data because they can use other standard aggregate measures, in addition to COUNT. Cubegrades are atoms which can support sophisticated “what if” analysis tasks dealing with behavior of arbitrary aggregates over different database segments. As such, cubegrades can be useful in marketing, sales analysis, and other typical data mining applications in business.
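The idea can be illustrated with a minimal Python sketch. It is not taken from the paper: the toy table `ROWS`, the helper names, and the choice to report each change as a target/source ratio are all assumptions made here for illustration. A cube is modeled as a set of dimension-value conditions, and a cubegrade compares the COUNT and AVG measures of a source cube with those of a target cube reached by specialization, generalization, or mutation.

```python
from statistics import mean

# Hypothetical toy fact table: two dimensions (region, product), one measure (sales).
ROWS = [
    {"region": "east", "product": "milk", "sales": 10.0},
    {"region": "east", "product": "cereal", "sales": 20.0},
    {"region": "west", "product": "milk", "sales": 30.0},
    {"region": "west", "product": "cereal", "sales": 40.0},
]


def measures(rows, descriptor, measure="sales"):
    """COUNT and AVG of `measure` over the rows matching every condition in `descriptor`."""
    values = [r[measure] for r in rows
              if all(r[dim] == val for dim, val in descriptor.items())]
    return {"COUNT": len(values), "AVG": mean(values) if values else None}


def cubegrade(rows, source, target):
    """Ratio target/source for each measure (assumed convention); None if undefined."""
    src, tgt = measures(rows, source), measures(rows, target)
    return {name: (tgt[name] / src[name]) if src[name] and tgt[name] is not None else None
            for name in src}


# Specialization (rolldown): add a condition to the source cube.
specialization = cubegrade(ROWS, {"region": "east"},
                           {"region": "east", "product": "milk"})
# Generalization (rollup): drop a condition from the source cube.
generalization = cubegrade(ROWS, {"region": "east", "product": "milk"},
                           {"product": "milk"})
# Mutation: change the value of one dimension.
mutation = cubegrade(ROWS, {"region": "east"}, {"region": "west"})

print(specialization, generalization, mutation)
```

An association rule corresponds to the special case where only the COUNT ratio of a specialization is reported; the AVG entry shows how a second aggregate can be tracked over the same pair of cubes.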

In this paper we introduce the concept of cubegrades. We define them and give examples of their usage. We then describe in detail an important task for computing cubegrades: the generation of significant cubes, which is analogous to generating frequent sets. A novel Grid Based Pruning (GBP) method is employed for this purpose. We experimentally demonstrate the practicality of the method. We conclude with a number of open questions and possible extensions of the work.
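The analogy with frequent sets can be sketched as follows. This is not the GBP method, whose details the abstract does not give. It is a plain level-wise (Apriori-style) search, and it assumes that "significant" simply means COUNT at or above a threshold. The function name `significant_cubes`, the toy table, and the threshold are hypothetical.

```python
from itertools import combinations

# Hypothetical toy table with two dimensions.
ROWS = [
    {"region": "east", "product": "milk"},
    {"region": "east", "product": "milk"},
    {"region": "east", "product": "cereal"},
    {"region": "west", "product": "milk"},
]


def significant_cubes(rows, dimensions, min_count):
    """Level-wise search for cube descriptors whose COUNT is at least min_count.

    A descriptor is a frozenset of (dimension, value) conditions. A descriptor
    with k conditions is counted only if every descriptor obtained by dropping
    one condition was significant at the previous level (COUNT is anti-monotone).
    """
    # Level 1: count every single-condition descriptor.
    counts = {}
    for row in rows:
        for dim in dimensions:
            desc = frozenset([(dim, row[dim])])
            counts[desc] = counts.get(desc, 0) + 1
    level = {d: c for d, c in counts.items() if c >= min_count}
    result = dict(level)

    k = 1
    while level:
        k += 1
        counts = {}
        for row in rows:
            for dims in combinations(dimensions, k):
                desc = frozenset((dim, row[dim]) for dim in dims)
                # Prune: skip descriptors with an insignificant sub-descriptor.
                if all(desc - {cond} in level for cond in desc):
                    counts[desc] = counts.get(desc, 0) + 1
        level = {d: c for d, c in counts.items() if c >= min_count}
        result.update(level)
    return result


found = significant_cubes(ROWS, ["region", "product"], min_count=2)
for descriptor, count in found.items():
    print(sorted(descriptor), count)
```

The pruning step relies on COUNT never increasing under specialization. Constraints on other aggregates such as AVG do not have this property in general, which is one reason a method beyond plain level-wise pruning is of interest.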

Author information

Authors and Affiliations

Department of Computer Science, Rutgers, The State University of N.J., Piscataway, N.J. 08854
Tomasz Imieliński, Leonid Khachiyan & Amin Abdulghani

About this article

Cite this article

Imieliński, T., Khachiyan, L. & Abdulghani, A. Cubegrades: Generalizing Association Rules. Data Mining and Knowledge Discovery 6, 219–257 (2002). https://doi.org/10.1023/A:1015417610840
