Abstract
This paper deals with association rule mining in the context of a complex, interactive, and iterative knowledge discovery process. After a general introduction covering the basics of association rule mining and of the knowledge discovery process in databases, we draw attention to the problems that arise when the two are integrated. We conclude that, with regard to human involvement and interactivity, the current situation is far from satisfactory. We tackle this problem from three sides. First, there is the algorithmic complexity: although today’s algorithms prune the immense search space efficiently, the achieved run times do not allow true interactivity. We therefore present a rule caching schema that significantly reduces the number of mining runs and helps to achieve interactivity even when the mining algorithms have extreme run times. Second, mining data today is typically stored in a relational database management system. We present an efficient integration with modern database systems, which is one of the key factors in practical mining applications. Third, interesting rules must be picked from the set of generated rules. This can be quite costly because the generated rule sets are normally large, whereas useful rules typically make up only a very small fraction of them. We enhance the traditional association rule mining framework in order to cope with this situation.
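The abstract does not spell out how the rule caching schema works, so the following is only a minimal sketch of one way such a cache could reduce mining runs. It assumes that rules are mined once at low "floor" thresholds for support and confidence, and that later requests with stricter thresholds are answered by filtering the cached rules. The names `mine_rules`, `RuleCache`, `floor_support`, and `floor_confidence` are hypothetical, and the brute-force miner stands in for a real algorithm purely for illustration.

```python
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force miner (illustrative only; exponential in the number of items).

    Returns a list of (body, head, support, confidence) tuples.
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))

    # Relative support of every itemset that reaches min_support.
    support = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count / n >= min_support:
                support[itemset] = count / n

    # Split each frequent itemset into body -> head. Every subset of a
    # frequent itemset is frequent too, so support[body] is always present.
    rules = []
    for itemset, supp in support.items():
        for k in range(1, len(itemset)):
            for body_items in combinations(sorted(itemset), k):
                body = frozenset(body_items)
                confidence = supp / support[body]
                if confidence >= min_confidence:
                    rules.append((body, itemset - body, supp, confidence))
    return rules


class RuleCache:
    """Hypothetical cache: one expensive mining run, many cheap queries."""

    def __init__(self, transactions, floor_support, floor_confidence):
        self.floor_support = floor_support
        self.floor_confidence = floor_confidence
        self.mining_runs = 1  # the single run performed below
        self._rules = mine_rules(transactions, floor_support, floor_confidence)

    def query(self, min_support, min_confidence):
        # Thresholds below the floor cannot be answered from the cache.
        if min_support < self.floor_support or min_confidence < self.floor_confidence:
            raise ValueError("thresholds below the cached floor need a new mining run")
        return [r for r in self._rules
                if r[2] >= min_support and r[3] >= min_confidence]


if __name__ == "__main__":
    baskets = [frozenset("abc"), frozenset("ab"), frozenset("ac"), frozenset("b")]
    cache = RuleCache(baskets, floor_support=0.25, floor_confidence=0.5)
    # Each interactive refinement is a filter over the cache, not a new run.
    for body, head, supp, conf in cache.query(0.5, 0.6):
        print(sorted(body), "->", sorted(head), round(supp, 2), round(conf, 2))
```

In this sketch the cost of the initial run grows as the floor thresholds are lowered, so the floor is a trade-off between a longer first run and how many later requests can be served without mining again.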
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Hipp, J., Güntzer, U., Nakhaeizadeh, G. (2002). Data Mining of Association Rules and the Process of Knowledge Discovery in Databases. In: Perner, P. (ed.) Advances in Data Mining. Lecture Notes in Computer Science, vol 2394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46131-0_2
DOI: https://doi.org/10.1007/3-540-46131-0_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44116-8
Online ISBN: 978-3-540-46131-9
eBook Packages: Springer Book Archive