Discovering Overlapping Quantitative Associations by Density-Based Mining of Relevant Attributes

Van Brussel, Thomas; Müller, Emmanuel; Goethals, Bart

doi:10.1007/978-3-319-30024-5_8


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9616)

Included in the following conference series: FoIKS


Abstract

Association rule mining is a widely used method for finding relationships in data and has been studied extensively in the literature. However, most of these methods do not work well for numerical attributes. State-of-the-art quantitative association rule mining algorithms follow a common routine: (1) discretize the data and (2) mine for association rules. This two-step approach can be rather inaccurate because discretization partitions the data space into disjoint intervals and therefore misses rules that are present in overlapping intervals.
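The effect of partitioning can be illustrated with a small sketch. It is not taken from the paper: the data values, the bin width, and the helper names `equal_width_bin` and `support` are all assumptions chosen for illustration. A dense group of values straddles a bin boundary, so each bin sees only half of it, while an interval that overlaps both bins captures the whole group.

```python
from collections import Counter

def equal_width_bin(value, lo, width):
    """Return the index of the equal-width bin that contains value."""
    return int((value - lo) // width)

def support(values, low, high):
    """Fraction of values that fall in the closed interval [low, high]."""
    return sum(low <= v <= high for v in values) / len(values)

# Hypothetical 1-D attribute: a dense group around 5.0 plus scattered points.
values = [0.5, 1.5, 2.5, 3.5,
          4.6, 4.7, 4.8, 4.9, 5.1, 5.2, 5.3, 5.4,
          6.5, 7.5, 8.5, 9.5]

# Step (1) of the two-step routine: partition into bins of width 1.0.
bin_counts = Counter(equal_width_bin(v, lo=0.0, width=1.0) for v in values)
max_bin_support = max(bin_counts.values()) / len(values)

# An interval that overlaps bins 4 and 5 covers the whole dense group.
overlap_support = support(values, 4.5, 5.5)

print(max_bin_support)   # best support any single bin can reach
print(overlap_support)   # support of the overlapping interval
```

With an assumed minimum support of 0.4, no single bin would be reported as frequent in this toy data, although the interval [4.5, 5.5] would be.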

In this paper, we explore the data for quantitative association rules hidden in overlapping regions of numeric data. Our method works without a discretization step and thus prevents the information loss caused by partitioning numeric attributes prior to the mining step. It exploits a statistical test for selecting relevant attributes, detects relationships between dense intervals in these attributes, and finally combines them into quantitative association rules. We evaluate our method on synthetic and real data to show its efficiency and quality improvement compared to state-of-the-art methods.
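As a rough illustration of the last two steps, the sketch below finds dense intervals in single attributes and scores a rule between two of them. It is a simplified stand-in, not the authors' algorithm: the abstract does not name the statistical test, so attribute selection is omitted, and the function names, the parameters `eps` and `min_pts`, and the data are assumptions. Intervals are built from core points only, a simplification of density-based clustering in one dimension.

```python
def dense_intervals(values, eps, min_pts):
    """Return 1-D dense intervals as (low, high) tuples.

    A point is a core point if at least min_pts values (itself included)
    lie within eps of it. Consecutive core points at most eps apart are
    merged into one interval. Border points are ignored (simplification).
    """
    pts = sorted(values)
    core = [p for p in pts if sum(abs(p - q) <= eps for q in pts) >= min_pts]
    intervals = []
    for p in core:
        if intervals and p - intervals[-1][1] <= eps:
            intervals[-1][1] = p          # extend the current interval
        else:
            intervals.append([p, p])      # start a new interval
    return [tuple(iv) for iv in intervals]

def rule_stats(rows, x_iv, y_iv):
    """Support and confidence of the rule: X in x_iv => Y in y_iv."""
    in_x = [r for r in rows if x_iv[0] <= r[0] <= x_iv[1]]
    both = [r for r in in_x if y_iv[0] <= r[1] <= y_iv[1]]
    support = len(both) / len(rows)
    confidence = len(both) / len(in_x) if in_x else 0.0
    return support, confidence

# Hypothetical two-attribute data set: (x, y) pairs.
rows = [(1.0, 10.0), (1.1, 10.2), (1.2, 10.1), (1.3, 10.3),
        (5.0, 20.0), (9.0, 3.0), (1.15, 30.0), (7.0, 15.0)]

x_ivs = dense_intervals([r[0] for r in rows], eps=0.25, min_pts=3)
y_ivs = dense_intervals([r[1] for r in rows], eps=0.25, min_pts=3)

# Combine one dense interval per attribute into a candidate rule.
sup, conf = rule_stats(rows, x_ivs[0], y_ivs[0])
print(x_ivs, y_ivs, sup, conf)
```

Because the intervals are derived from the data rather than from a fixed grid, intervals found for different rules are free to overlap.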



Author information

Authors and Affiliations

University of Antwerp, Antwerp, Belgium
Thomas Van Brussel, Emmanuel Müller & Bart Goethals
Hasso-Plattner-Institute, Potsdam, Germany
Emmanuel Müller

Authors

Thomas Van Brussel
Emmanuel Müller
Bart Goethals

Corresponding author

Correspondence to Thomas Van Brussel.

Editor information

Editors and Affiliations

Faculteit Wetenschappen, Universiteit Hasselt, Hasselt, Belgium
Marc Gyssens
Dept. Ciencias Ingeniería Computación, Universidad Nacional del Sur, Bahía Blanca, Argentina
Guillermo Simari

About this paper

Cite this paper

Van Brussel, T., Müller, E., Goethals, B. (2016). Discovering Overlapping Quantitative Associations by Density-Based Mining of Relevant Attributes. In: Gyssens, M., Simari, G. (eds) Foundations of Information and Knowledge Systems. FoIKS 2016. Lecture Notes in Computer Science, vol 9616. Springer, Cham. https://doi.org/10.1007/978-3-319-30024-5_8


DOI: https://doi.org/10.1007/978-3-319-30024-5_8
Published: 04 March 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30023-8
Online ISBN: 978-3-319-30024-5
eBook Packages: Computer Science (R0)
