Abstract
Many researchers in our community (this author included) regularly emphasize the role constraints play in improving performance of data-mining algorithms. This emphasis has led to remarkable progress – current algorithms allow an incredibly rich and varied set of hidden patterns to be efficiently elicited from massive datasets, even under the burden of NP-hard problem definitions and disk-resident or distributed data. But this progress has come at a cost. In our single-minded drive towards maximum performance, we have often neglected and in fact hindered the important role of discovery in the knowledge discovery and data-mining (KDD) process. In this paper, I propose various strategies for applying constraints within algorithms for itemset and rule mining in order to escape this pitfall.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bayardo, R.J. (2006). The Hows, Whys, and Whens of Constraints in Itemset and Rule Discovery. In: Boulicaut, JF., De Raedt, L., Mannila, H. (eds) Constraint-Based Mining and Inductive Databases. Lecture Notes in Computer Science, vol 3848. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11615576_1
DOI: https://doi.org/10.1007/11615576_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31331-1
Online ISBN: 978-3-540-31351-9
eBook Packages: Computer Science (R0)