Abstract
Association rule mining is a widely used method for finding relationships in data and has been studied extensively in the literature. However, most existing methods do not work well for numerical attributes. State-of-the-art quantitative association rule mining algorithms follow a common routine: (1) discretize the data and (2) mine for association rules. This two-step approach can be inaccurate because discretization partitions the data space into disjoint intervals and therefore misses rules that are present in overlapping intervals.
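To make the partitioning problem concrete, the following minimal sketch compares the support of fixed equal-width bins with the support of an interval that straddles a bin boundary. The attribute values, the bin edges, and the `support` helper are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def support(values, lo, hi):
    """Fraction of values that fall in the half-open interval [lo, hi)."""
    return sum(lo <= v < hi for v in values) / len(values)

# Hypothetical toy attribute: six of ten values cluster around 5.0.
values = [1.0, 2.0, 4.5, 4.7, 4.9, 5.1, 5.3, 5.5, 8.0, 9.0]

# Step (1) of the two-step routine: fixed, disjoint equal-width bins.
bins = [(0.0, 5.0), (5.0, 10.0)]
bin_support = {b: support(values, *b) for b in bins}

# An interval that overlaps both bins and covers the whole cluster.
overlap_support = support(values, 4.5, 5.6)

print(bin_support)      # support of each disjoint bin
print(overlap_support)  # support of the overlapping interval
```

In this toy data the bin boundary at 5.0 cuts the cluster in half, so each bin reaches a support of only 0.5 while the much narrower interval [4.5, 5.6) reaches 0.6. With a minimum support threshold of 0.6, no bin-based item would qualify, although a frequent interval exists.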
In this paper, we mine quantitative association rules hidden in overlapping regions of numeric data. Our method requires no discretization step and thus avoids the information loss caused by partitioning numeric attributes prior to mining. It uses a statistical test to select relevant attributes, detects relationships between dense intervals in these attributes, and finally combines them into quantitative association rules. We evaluate our method on synthetic and real data to show its efficiency and its quality improvement over state-of-the-art methods.
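The abstract does not specify how dense intervals are detected or combined, so the sketch below is only a simplified stand-in and not the authors' algorithm. It assumes a one-dimensional gap-based notion of density (hypothetical parameters `eps` and `min_pts`) and then measures the confidence of a rule between two such intervals. The statistical test for attribute selection is omitted, and all names and data are illustrative.

```python
def dense_intervals(values, eps, min_pts):
    """Return closed intervals (lo, hi) built from runs of sorted values whose
    consecutive gaps are at most eps and that contain at least min_pts values."""
    ordered = sorted(values)
    if not ordered:
        return []
    intervals, run = [], [ordered[0]]
    for v in ordered[1:]:
        if v - run[-1] <= eps:
            run.append(v)          # still inside the current dense run
        else:
            if len(run) >= min_pts:
                intervals.append((run[0], run[-1]))
            run = [v]              # start a new run
    if len(run) >= min_pts:
        intervals.append((run[0], run[-1]))
    return intervals

def confidence(rows, x_iv, y_iv):
    """Confidence of the rule 'x in x_iv -> y in y_iv' over (x, y) rows."""
    covered = [(x, y) for x, y in rows if x_iv[0] <= x <= x_iv[1]]
    if not covered:
        return 0.0
    hits = sum(y_iv[0] <= y <= y_iv[1] for _, y in covered)
    return hits / len(covered)

# Hypothetical two-attribute data set.
rows = [(1.0, 10.0), (1.2, 10.5), (1.4, 11.0), (1.5, 10.8),
        (5.0, 30.0), (5.2, 30.5), (5.3, 31.0), (9.0, 10.2)]

x_ivs = dense_intervals([x for x, _ in rows], eps=0.5, min_pts=3)
y_ivs = dense_intervals([y for _, y in rows], eps=0.6, min_pts=3)
rule_conf = confidence(rows, x_ivs[0], y_ivs[0])
```

Because the intervals are derived from the data rather than from a fixed grid, nothing prevents intervals found in different attribute combinations from overlapping, which is the property the abstract emphasizes.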
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Van Brussel, T., Müller, E., Goethals, B. (2016). Discovering Overlapping Quantitative Associations by Density-Based Mining of Relevant Attributes. In: Gyssens, M., Simari, G. (eds) Foundations of Information and Knowledge Systems. FoIKS 2016. Lecture Notes in Computer Science, vol 9616. Springer, Cham. https://doi.org/10.1007/978-3-319-30024-5_8
DOI: https://doi.org/10.1007/978-3-319-30024-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30023-8
Online ISBN: 978-3-319-30024-5
eBook Packages: Computer Science (R0)