
Cubegrades: Generalizing Association Rules

Data Mining and Knowledge Discovery

Abstract

Cubegrades are a generalization of association rules which represent how a set of measures (aggregates) is affected by modifying a cube through specialization (rolldown), generalization (rollup) and mutation (which is a change in one of the cube's dimensions). Cubegrades are significantly more expressive than association rules in capturing trends and patterns in data because they can use other standard aggregate measures, in addition to COUNT. Cubegrades are atoms which can support sophisticated “what if” analysis tasks dealing with behavior of arbitrary aggregates over different database segments. As such, cubegrades can be useful in marketing, sales analysis, and other typical data mining applications in business.
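To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy fact table (`ROWS`), a cube described as a dictionary of dimension=value pairs, and a single measure named `sales`. The helper names `cube_rows`, `measures` and `cubegrade` are illustrative only. It shows how COUNT and AVG change when a source cube is specialized, generalized or mutated.

```python
from statistics import mean

# Hypothetical toy fact table: three dimensions and one numeric measure.
ROWS = [
    {"region": "east", "segment": "urban", "product": "tea", "sales": 10.0},
    {"region": "east", "segment": "urban", "product": "coffee", "sales": 30.0},
    {"region": "east", "segment": "rural", "product": "tea", "sales": 20.0},
    {"region": "west", "segment": "urban", "product": "tea", "sales": 40.0},
    {"region": "west", "segment": "rural", "product": "coffee", "sales": 50.0},
]


def cube_rows(rows, descriptor):
    """Return the rows matching every dimension=value pair of the cube."""
    return [r for r in rows if all(r[d] == v for d, v in descriptor.items())]


def measures(rows, descriptor, measure="sales"):
    """Compute COUNT and AVG of the measure over the cube."""
    values = [r[measure] for r in cube_rows(rows, descriptor)]
    return {"count": len(values), "avg": mean(values) if values else None}


def cubegrade(rows, source, target, measure="sales"):
    """Compare the measures of a source cube and a modified target cube."""
    src = measures(rows, source, measure)
    tgt = measures(rows, target, measure)
    # The ratio is left undefined when either cube is empty or the source AVG is 0.
    ratio = tgt["avg"] / src["avg"] if src["avg"] and tgt["avg"] is not None else None
    return {"source": src, "target": tgt, "avg_ratio": ratio}


source = {"region": "east"}
# Specialization (rolldown): add a descriptor to the source cube.
special = cubegrade(ROWS, source, {"region": "east", "product": "tea"})
# Generalization (rollup): drop a descriptor from the source cube.
general = cubegrade(ROWS, source, {})
# Mutation: change the value of one of the source cube's dimensions.
mutated = cubegrade(ROWS, source, {"region": "west"})

for name, grade in [("specialize", special), ("generalize", general), ("mutate", mutated)]:
    print(name, grade["target"]["count"], grade["avg_ratio"])
```

An association rule corresponds to the special case where the only operation is specialization and the only measure is COUNT; here the same comparison is made on AVG(sales) as well.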

In this paper we introduce the concept of cubegrades. We define them and give examples of their usage. We then describe in detail an important task for computing cubegrades: generation of significant cubes, which is analogous to generating frequent sets. A novel Grid Based Pruning (GBP) method is employed for this purpose. We experimentally demonstrate the practicality of the method. We conclude with a number of open questions and possible extensions of the work.
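The analogy with frequent sets can be illustrated with a short sketch. This is not the GBP method, whose details are not given in this abstract; it is a plain level-wise (Apriori-style) enumeration that assumes significance means COUNT at or above a threshold. The table `ROWS`, the function names and the `min_count` parameter are hypothetical.

```python
from itertools import combinations

# Hypothetical toy table; columns are (region, segment, product).
ROWS = [
    ("east", "urban", "tea"),
    ("east", "urban", "coffee"),
    ("east", "rural", "tea"),
    ("west", "urban", "tea"),
    ("west", "rural", "coffee"),
]


def count(rows, descriptor):
    """COUNT of rows matching a cube given as a frozenset of (column, value)."""
    return sum(all(r[i] == v for i, v in descriptor) for r in rows)


def significant_cubes(rows, min_count):
    """Enumerate cubes with COUNT >= min_count, one descriptor level at a time."""
    width = len(rows[0])
    singles = {frozenset([(i, r[i])]) for r in rows for i in range(width)}
    level = {d for d in singles if count(rows, d) >= min_count}
    result = {}
    while level:
        for d in level:
            result[d] = count(rows, d)
        candidates = set()
        for a, b in combinations(level, 2):
            merged = a | b
            one_longer = len(merged) == len(a) + 1
            distinct_columns = len({i for i, _ in merged}) == len(merged)
            # COUNT can only shrink under specialization, so a candidate is
            # worth counting only if all of its generalizations are significant.
            if one_longer and distinct_columns and all(merged - {x} in level for x in merged):
                candidates.add(merged)
        level = {c for c in candidates if count(rows, c) >= min_count}
    return result


cubes = significant_cubes(ROWS, min_count=2)
for descriptor, n in sorted(cubes.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(descriptor), n)
```

This pruning relies on COUNT being monotone under specialization. Measures such as AVG do not have that property, which is one reason a different pruning strategy is needed for general aggregate constraints.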




Cite this article

Imieliński, T., Khachiyan, L. & Abdulghani, A. Cubegrades: Generalizing Association Rules. Data Mining and Knowledge Discovery 6, 219–257 (2002). https://doi.org/10.1023/A:1015417610840
