Using association rules to mine for strong approximate dependencies

Sánchez, Daniel; Serrano, José María; Blanco, Ignacio; Martín-Bautista, Maria Jose; Vila, María-Amparo

doi:10.1007/s10618-008-0092-3

Published: 30 March 2008

Volume 16, pages 313–348, (2008)

Data Mining and Knowledge Discovery

Daniel Sánchez^1,2, José María Serrano^3, Ignacio Blanco^2, Maria Jose Martín-Bautista^2 & María-Amparo Vila^2

291 Accesses
18 Citations

Abstract

In this paper we deal with the problem of mining for approximate dependencies (AD) in relational databases. We introduce a definition of AD based on the concept of association rule, by means of suitable definitions of the concepts of item and transaction. This definition allows us to measure both the accuracy and support of an AD. We provide an interpretation of the new measures based on the complexity of the theory (set of rules) that describes the dependence, and we employ this interpretation to compare the new measures with existing ones. A methodology to adapt existing association rule mining algorithms to the task of discovering ADs is introduced. The adapted algorithms obtain the set of ADs that hold in a relation with accuracy and support greater than user-defined thresholds. The experiments we have performed show that our approach performs reasonably well over large databases with real-world data.
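As an illustration only, the idea of reading a dependency as an association rule can be sketched in Python. The abstract does not spell out how items and transactions are defined, so the sketch rests on assumptions: each unordered pair of distinct rows is treated as one transaction, and a transaction contains the item for an attribute when the two rows agree on it. Plain rule confidence stands in for the accuracy measure, which may differ from the measure used in the paper. All function names and the sample data are hypothetical.

```python
from itertools import combinations


def pair_transactions(rows, attributes):
    """Build one transaction per unordered pair of distinct rows.

    Assumption: the transaction holds the attributes on which
    the two rows take the same value.
    """
    return [
        frozenset(a for a in attributes if t[a] == s[a])
        for t, s in combinations(rows, 2)
    ]


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = frozenset(items)
    if not transactions:
        return 0.0
    return sum(1 for tr in transactions if items <= tr) / len(transactions)


def dependency_measures(rows, attributes, lhs, rhs):
    """Return (support, confidence) of the rule lhs => rhs over row pairs.

    Confidence is used here as a simple stand-in for accuracy.
    """
    transactions = pair_transactions(rows, attributes)
    supp_lhs = support(transactions, lhs)
    supp_both = support(transactions, set(lhs) | set(rhs))
    confidence = supp_both / supp_lhs if supp_lhs > 0 else 0.0
    return supp_both, confidence


# Hypothetical toy relation: zip determines city, but not the reverse.
rows = [
    {"city": "Granada", "zip": "18071"},
    {"city": "Granada", "zip": "18071"},
    {"city": "Jaén", "zip": "23071"},
    {"city": "Jaén", "zip": "23009"},
]
attributes = ["city", "zip"]

for lhs, rhs in [(["city"], ["zip"]), (["zip"], ["city"])]:
    supp, conf = dependency_measures(rows, attributes, lhs, rhs)
    print(lhs, "->", rhs, "support:", round(supp, 3), "confidence:", round(conf, 3))
```

In this toy relation, two of the six row pairs agree on city and only one of those also agrees on zip, so city -> zip holds with confidence one half, while zip -> city holds in every pair that agrees on zip. Thresholds on both values would then play the role of the user-defined accuracy and support thresholds mentioned in the abstract.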



Author information

Authors and Affiliations

E.T.S.I.I.T., C/ Periodista Daniel Saucedo Aranda s/n, 18071, Granada, Spain
Daniel Sánchez
Department of Computer Science and A.I., University of Granada, Granada, Spain
Daniel Sánchez, Ignacio Blanco, Maria Jose Martín-Bautista & María-Amparo Vila
Department of Informatics, University of Jaén, Jaén, Spain
José María Serrano


Corresponding author

Correspondence to Daniel Sánchez.

Additional information

Responsible editor: M. J. Zaki.


About this article

Cite this article

Sánchez, D., Serrano, J.M., Blanco, I. et al. Using association rules to mine for strong approximate dependencies. Data Min Knowl Disc 16, 313–348 (2008). https://doi.org/10.1007/s10618-008-0092-3


Received: 02 August 2006
Accepted: 24 January 2008
Published: 30 March 2008
Issue Date: June 2008
DOI: https://doi.org/10.1007/s10618-008-0092-3
