Efficient Rule-Based Attribute-Oriented Induction for Data Mining

Journal of Intelligent Information Systems

Abstract

Data mining has become an important technique with tremendous potential in many commercial and industrial applications. Attribute-oriented induction is a powerful mining technique and has been successfully implemented in the data mining system DBMiner (Han et al. Proc. 1996 Int'l Conf. on Data Mining and Knowledge Discovery (KDD'96), Portland, Oregon, 1996). However, its induction capability is limited by unconditional concept generalization. In this paper, we extend concept generalization to rule-based concept hierarchies, which greatly enhances its induction power. When a previously proposed induction algorithm is applied to the more general rule-based case, an induction anomaly occurs that degrades its efficiency. We have developed an efficient algorithm for induction in the rule-based case that avoids the anomaly. Performance studies show that the algorithm is superior to a previously proposed algorithm based on backtracking.
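
To make the distinction between unconditional and rule-based generalization concrete, the following is a minimal sketch and not the algorithm from the paper. The toy relation, the attribute names (status, gpa), the thresholds, and all function names are assumptions chosen for illustration. It only shows that a rule-based hierarchy may consult other attributes when lifting a value to a higher-level concept; it does not model the induction anomaly or the efficient algorithm the abstract refers to.

```python
from collections import Counter

# Hypothetical toy relation of (status, gpa) tuples; not data from the paper.
STUDENTS = [
    ("undergraduate", 3.6),
    ("undergraduate", 3.1),
    ("graduate", 3.6),
    ("graduate", 3.9),
    ("graduate", 3.1),
]


def generalize_unconditional(status, gpa):
    """Lift a GPA to a concept using the GPA value alone (assumed threshold)."""
    return "excellent" if gpa >= 3.5 else "good"


def generalize_rule_based(status, gpa):
    """Lift a GPA to a concept using a rule that also consults `status`."""
    # Assumed rule: graduates need a higher GPA to count as "excellent".
    threshold = 3.8 if status == "graduate" else 3.5
    return "excellent" if gpa >= threshold else "good"


def induce(rows, generalize):
    """Generalize the GPA attribute, then merge identical tuples with counts."""
    return Counter((status, generalize(status, gpa)) for status, gpa in rows)


if __name__ == "__main__":
    print(induce(STUDENTS, generalize_unconditional))
    print(induce(STUDENTS, generalize_rule_based))
```

Under these assumed thresholds, the graduate with GPA 3.6 is generalized to "excellent" by the unconditional hierarchy but to "good" by the rule-based one, so the two inductions yield different generalized relations from the same data.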

References

  • Agrawal, S., Agrawal, R., Deshpande, P.M., Gupta, A., Naughton, J.F., Ramakrishnan, R., and Sarawagi, S. (1996). On the Computation of Multidimensional Aggregates. In Proceedings of the International Conference on Very Large Databases, Bombay, India (pp. 506–521).

  • Agrawal, R., Ghosh, S., Imielinski, T., Iyer, B., and Swami, A. (1992). An Interval Classifier for Database Mining Applications. In Proc. 18th Int. Conf. Very Large Data Bases, Vancouver, Canada (pp. 560–573).

  • Agrawal, R., Imielinski, T., and Swami, A. (1993). Database Mining: A Performance Perspective. IEEE Trans. on Knowledge and Data Engineering, 5, 914–925.

  • Agrawal, R. and Srikant, R. (1994). Fast Algorithms for Mining Association Rules. In Proc. 1994 Int. Conf. Very Large Data Bases, Santiago, Chile (pp. 487–499).

  • Brodie, M.L. and Ceri, S. (1992). On Intelligent and Cooperative Information Systems: A Workshop Summary. International Journal of Intelligent and Cooperative Information Systems, 1(2), 233–248.

  • Chaudhuri, S. and Dayal, U. (1997). An Overview of Data Warehousing and OLAP Technology. ACM-SIGMOD Record, 26(1), 65–74.

  • Cheung, D.W., Fu, A.W., and Han, J. (1994). Knowledge Discovery in Databases: A Rule-Based Attribute-Oriented Approach. In Proc. 1994 Int. Symp. on Methodologies for Intelligent Systems, Charlotte, North Carolina (pp. 164–173).

  • Cheung, D.W., Han, J., Ng, V.T., Fu, A.W., and Fu, Y. (1996a). A Fast Distributed Algorithm for Mining Association Rules. In Proc. Fourth International Conference on Parallel and Distributed Information System (PDIS-96), Miami Beach, Florida.

  • Cheung, D.W., Han, J., Ng, V.T., and Wong, C.Y. (1996b). Maintenance of Discovered Association Rules in Large Databases: An Incremental Updating Technique. In Proc. 1996 IEEE Int. Conf. on Data Engineering, New Orleans, Louisiana.

  • Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., and Uthurusamy, R. (1995). Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press.

  • Fisher, D. (1987). Improving Inference through Conceptual Clustering. In Proc. 1987 AAAI Conf., Seattle, Washington (pp. 461–465).

  • Frawley, W.J., Piatetsky-Shapiro, G., and Matheus, C.J. (1991). Knowledge Discovery in Databases: An Overview. In G. Piatetsky-Shapiro and W.J. Frawley (Eds.), Knowledge Discovery in Databases (pp. 1–27). AAAI/MIT Press.

  • Gallaire, H., Minker, J., and Nicolas, J. (1984). Logic and Databases: A Deductive Approach. ACM Comput. Surv., 16, 153–185.

  • Gray, J., Bosworth, A., Layman, A., and Pirahesh, H. (1996). Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-tab, and Sub-total. In Proceedings of the 12th Intl. Conference on Data Engineering, New Orleans (pp. 152–159).

  • Han, J., Cai, Y., and Cercone, N. (1992). Knowledge Discovery in Databases: An Attribute-Oriented Approach. In Proc. 18th Int. Conf. Very Large Data Bases, Vancouver, Canada (pp. 547–559).

  • Han, J., Cai, Y., and Cercone, N. (1993). Data-Driven Discovery of Quantitative Rules in Relational Databases. IEEE Trans. Knowledge and Data Engineering, 5, 29–40.

  • Han, J. and Fu, Y. (1995a). Discovery of Multiple-Level Association Rules from Large Databases. In Proc. 1995 Int. Conf. Very Large Data Bases, Zurich, Switzerland (pp. 420–431).

  • Han, J. and Fu, Y. (1995b). Exploration of the Power of Attribute-Oriented Induction in Data Mining. In U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining (pp. 399–421). AAAI/MIT Press.

  • Han, J., Fu, Y., Wang, W., Chiang, J., Gong, W., Koperski, K., Li, D., Lu, Y., Rajan, A., Stefanovic, N., Xia, B., and Zaiane, O.R. (1996). DBMiner: A System for Mining Knowledge in Large Relational Databases. In Proc. 1996 Int'l Conf. on Data Mining and Knowledge Discovery (KDD'96), Portland, Oregon.

  • Harinarayan, V., Rajaraman, A., and Ullman, J. (1996). Implementing Data Cubes Efficiently. In Proc. 1996 ACM-SIGMOD Int. Conf. Management of Data, Montreal, Canada.

  • Haussler, D. (1987). Learning Conjunctive Concepts in Structural Domains. In Proc. 1987 AAAI Conf., Seattle, Washington (pp. 466–470).

  • Hwang, H.Y. and Fu, A.W. (1995). Efficient Algorithms for Attribute-Oriented Induction. In Proc. 1995 1st Int. Conf. on Knowledge Discovery and Data Mining (KDD'95), Montreal, Canada.

  • Kaufman, K.A., Michalski, R.S., and Kerschberg, L. (1991). Mining for Knowledge in Databases: Goals and General Description of the INLEN System. In G. Piatetsky-Shapiro and W.J. Frawley (Eds.), Knowledge Discovery in Databases (pp. 449–462). AAAI/MIT Press.

  • Michalski, R.S. (1983). A Theory and Methodology of Inductive Learning. In Michalski et al. (Eds.), Machine Learning: An Artificial Intelligence Approach, Vol. 1 (pp. 83–134). Morgan Kaufmann.

  • Silberschatz, A., Stonebraker, M., and Ullman, J. (1995). Database Research: Achievements and Opportunities Into the 21st Century. In Report of an NSF Workshop on the Future of Database Systems Research.

  • Ullman, J.D. (1989). Principles of Database and Knowledge-Base Systems, Vols. 1/2. Computer Science Press.

  • Widom, J. (1995). Research Problems in Data Warehousing. In Proc. 1995 4th Int. Conf. on Information and Knowledge Management (CIKM).

  • Zytkow, J. and Baker, J. (1991). Interactive Mining of Regularities in Databases. In G. Piatetsky-Shapiro and W.J. Frawley (Eds.), Knowledge Discovery in Databases (pp. 31–54). AAAI/MIT Press.

Cite this article

Cheung, D.W., Hwang, H., Fu, A.W. et al. Efficient Rule-Based Attribute-Oriented Induction for Data Mining. Journal of Intelligent Information Systems 15, 175–200 (2000). https://doi.org/10.1023/A:1008778107391

  • DOI: https://doi.org/10.1023/A:1008778107391
