Mining frequent conjunctive queries in relational databases through dependency discovery

Goethals, Bart; Laurent, Dominique; Le Page, Wim; Dieng, Cheikh Tidiane

doi:10.1007/s10115-012-0526-5

Regular Paper
Published: 19 August 2012

Volume 33, pages 655–684, (2012)

Knowledge and Information Systems

Bart Goethals¹,
Dominique Laurent²,
Wim Le Page¹ &
…
Cheikh Tidiane Dieng²

Abstract

We present an approach for mining frequent conjunctive queries in arbitrary relational databases. Our pattern class is the simple but appealing subclass of simple conjunctive queries. Our algorithm, called Conqueror$^{+}$, is capable of detecting previously unknown functional and inclusion dependencies that hold on the database relations as well as on joins of relations. These newly detected dependencies are then used to prune redundant queries. We propose an efficient database-oriented implementation of our algorithm using SQL and provide several promising experimental results.
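
To make the two kinds of dependency named in the abstract concrete, the following is a minimal sketch of how each could be tested with plain SQL issued from Python. It is not the Conqueror$^{+}$ implementation: the helper names (`count_distinct`, `fd_holds`, `ind_holds`) and the `movies`/`directors` tables are hypothetical, and the sketch assumes trusted identifiers, non-empty and disjoint attribute lists, and data without NULLs.

```python
import sqlite3


def count_distinct(conn, table, cols):
    """Count the distinct tuples in the projection of `table` on `cols`."""
    col_list = ", ".join(cols)  # identifiers are assumed trusted (illustration only)
    sql = f"SELECT COUNT(*) FROM (SELECT DISTINCT {col_list} FROM {table})"
    return conn.execute(sql).fetchone()[0]


def fd_holds(conn, table, lhs, rhs):
    """Check the functional dependency lhs -> rhs on `table`.

    The dependency holds exactly when adding the rhs attributes to the
    projection on lhs does not create any additional distinct tuples.
    """
    return count_distinct(conn, table, lhs) == count_distinct(conn, table, lhs + rhs)


def ind_holds(conn, r, r_col, s, s_col):
    """Check the unary inclusion dependency r[r_col] <= s[s_col]."""
    # The dependency holds when no value of r.r_col is missing from s.s_col.
    sql = (
        f"SELECT COUNT(*) FROM "
        f"(SELECT {r_col} FROM {r} EXCEPT SELECT {s_col} FROM {s})"
    )
    return conn.execute(sql).fetchone()[0] == 0


# Hypothetical toy database, used only to exercise the helpers above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE movies (mid INTEGER, title TEXT, did INTEGER);
    CREATE TABLE directors (did INTEGER, name TEXT);
    INSERT INTO movies VALUES (1, 'A', 10), (2, 'B', 10), (3, 'C', 20);
    INSERT INTO directors VALUES (10, 'X'), (20, 'Y'), (30, 'Z');
    """
)

print(fd_holds(conn, "movies", ["mid"], ["title"]))         # does mid determine title?
print(fd_holds(conn, "movies", ["did"], ["mid"]))           # does did determine mid?
print(ind_holds(conn, "movies", "did", "directors", "did")) # is movies.did within directors.did?
```

The same checks can be run on a join by first materialising it as a view or temporary table and passing that name as `table`. Once a dependency is known to hold, some queries become equivalent to others, which is the intuition behind using dependencies for pruning.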

Notes

We consider in our approach that the only possible functional dependencies with an empty left-hand side are of the form $\emptyset \rightarrow \emptyset $.
The projection over the empty set is defined in [16] as being empty if the projected relation is empty and, otherwise, as containing a single specific tuple, called the empty tuple.
The source code of Conqueror$^{+}$ can be downloaded at http://www.adrem.ua.ac.be.
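
The convention for projections over the empty set described in the second note can be illustrated with a small sketch. The function `project` and the sample rows are hypothetical; a relation is modelled here as a list of dictionaries and a projection as a set of tuples, which is one possible reading of the convention rather than the definition given in [16].

```python
def project(relation, attributes):
    """Project `relation` (a list of dicts) on `attributes`, removing duplicates."""
    # Each row contributes the tuple of its values on the chosen attributes.
    # With no attributes, every row contributes the empty tuple ().
    return {tuple(row[a] for a in attributes) for row in relation}


rows = [{"a": 1, "b": "x"}, {"a": 2, "b": "x"}]

print(project(rows, ["b"]))  # ordinary projection on one attribute
print(project(rows, []))     # non-empty relation, empty attribute set
print(project([], []))       # empty relation, empty attribute set
```

Under this modelling the two cases of the note arise without special handling: a non-empty relation projects to a set holding only the empty tuple, and an empty relation projects to the empty set.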

References

Agrawal R, Mannila H, Srikant R, Toivonen H, Verkamo A (1996) Fast discovery of association rules. In: Advances in knowledge discovery and data mining. AAAI-MIT Press, Cambridge, pp 309–328
Baixeries J (2004) A formal concept analysis framework to mine functional dependencies. In: Int. workshop on mathematical methods for learning, pp 1–9
Baixeries J (2008) A formal context for symmetric dependencies. In: Formal concept analysis, 6th international conference, ICFCA, vol 4933 of lecture notes in computer science, Springer, Berlin, pp 90–105
Baixeries J (2011) A new formal context for symmetric dependencies. In: Concept lattices and their applications, CLA, pp 333–348 (INRIA)
Bohannon P, Fan W, Geerts F, Jia X, Kementsietsidis A (2007) Conditional functional dependencies for data cleaning. In: ICDE, pp 746–755
Dehaspe L, Toivonen H (2001) Discovery of relational association rules. In: Džeroski S, Lavrač N (eds) Relational data mining. Springer, Berlin, pp 189–208
Dieng C, Jen T-Y, Laurent D (2010) An efficient computation of frequent queries in a star schema. In: DEXA 2010, vol 6262(II) of LNCS, Springer, Berlin, pp 225–239
Diop C, Giacometti A, Laurent D, Spyratos N (2002) Composition of mining contexts for efficient extraction of association rules. In: EDBT’02, vol 2287 of LNCS. Springer, Berlin, pp 106–123
Goethals B, den Bussche JV (2002) Relational association rules: getting warmer. In: ESF exploratory workshop on pattern detection and discovery in data mining, vol 2447 of LNCS, Springer, Berlin, pp 125–139
Goethals B, Hoekx E, den Bussche JV (2005) Mining tree queries in a graph. In: ACM KDD, pp 61–69
Goethals B, Laurent D, Le Page W (2010) Discovery and application of functional dependencies in conjunctive query mining. In: DAWAK 2010, vol 6263 of LNCS, Springer, Berlin, pp 142–156
Goethals B, Le Page W, Mannila H (2008) Mining association rules of simple conjunctive queries. In: SIAM-SDM, pp 96–107
Hoekx E, den Bussche JV (2006) Mining for tree-query associations in a graph. In: IEEE ICDM, pp 254–264
IMDB. http://imdb.com. 2008
Inokuchi A, Washio T, Motoda H (2000) An Apriori-based algorithm for mining frequent substructures from graph data. In: PKDD, vol 1910 of LNCS, Springer, Berlin, pp 13–23
Jen T-Y, Laurent D, Spyratos N (2008) Mining all frequent selection-projection queries from a relational table. In: EDBT’08, ACM Press, pp 368–379
Jen T-Y, Laurent D, Spyratos N (2009) Mining frequent conjunctive queries in star schemas. In: International database engineering and applications symposium (IDEAS), ACM Press, pp 97–108
Jen T-Y, Laurent D, Spyratos N, Sy O (2005) Towards mining frequent queries in star schemes. In: International workshop on knowledge discovery in databases (KDID), vol 3933 of LNCS, Springer, Berlin, pp 104–123
Jensen V, Soparker N (2000) Frequent itemset counting across multiple tables. In: PAKDD, vol 1805 of lecture notes in computer science, Springer, Berlin, pp 49–61
Kamber M, Han J, Chiang J (1997) Metarule-guided mining of multi-dimensional association rules using data cubes. In: ACM KDD, pp 207–210
Knuth D (2006) The art of computer programming, vol. 4. Addison-Wesley, Reading
Kuramochi M, Karypis G (2001) Frequent subgraph discovery. In: IEEE ICDM, pp 313–320
Lakhal L, Stumme G (2005) Efficient mining of association rules based on formal concept analysis. In: Formal concept analysis, vol 3626 of lecture notes in computer science. Springer, Berlin, pp 180–195
Le Page W (2009) Mining patterns in relational databases. PhD thesis, University of Antwerp, Antwerp
Lopes S, Petit J-M, Lakhal L (2002) Functional and approximative dependency mining: database and FCA points of view. J Exp Theor Artif Intell 14(2–3):93–114
Ng E, Fu A, Wang K (2002) Mining association rules from stars. In: IEEE-ICDM, pp 322–329
Nijssen S, Kok JN (2003) Efficient frequent query discovery in FARMER. In: PKDD 2003, vol 2838 of LNCS, Springer, Berlin, pp 350–362
Novelli N, Cicchetti R (2001) FUN: an efficient algorithm for mining functional and embedded dependencies. In: International conference on database theory (ICDT), vol 1973 of LNCS, Springer, Berlin, pp 189–203
Pasquier N, Bastide Y, Taouil R, Lakhal L (1999) Efficient mining of association rules using closed itemset lattices. Inf Syst 24(1):25–46
Plotkin G (1970) A note on inductive generalization. Mach Intell 5:153–163
Ullman J (1988–1989) Principles of databases and knowledge-base systems, vol 1–2. Computer Science Press, Rockville
Weisstein EW (2009) Restricted growth string. In: A Wolfram web resource (http://mathword.wolfram.com/RestrictedGrowthString.html)
Wyss C-M, Giannella C, Robertson E-L (2001) FastFDs: a heuristic-driven, depth-first algorithm for mining functional dependencies from relation instances. In: DAWAK, vol 2114 of LNCS, Springer, Berlin, pp 101–110
Yan X, Han J (2002) gSpan: graph-based substructure pattern mining. In: IEEE ICDM, pp 721–724
Yao H, Hamilton HJ (2008) Mining functional dependencies from data. Data Min Knowl Discov 16(2):197–219
Zaki M (2002) Efficiently mining frequent trees in a forest. In: ACM KDD, pp 71–80
Zaki M, Hsiao C-J (2005) Efficient algorithms for mining closed itemsets and their lattice structure. IEEE-TKDE 17(4):462–478

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Antwerp, 2020 Antwerp, Belgium
Bart Goethals & Wim Le Page
ETIS, CNRS, ENSEA, Université de Cergy Pontoise, 95000 Cergy-Pontoise, France
Dominique Laurent & Cheikh Tidiane Dieng

Corresponding author

Correspondence to Dominique Laurent.

About this article

Cite this article

Goethals, B., Laurent, D., Le Page, W. et al. Mining frequent conjunctive queries in relational databases through dependency discovery. Knowl Inf Syst 33, 655–684 (2012). https://doi.org/10.1007/s10115-012-0526-5

Received: 30 August 2011
Revised: 12 April 2012
Accepted: 14 July 2012
Published: 19 August 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s10115-012-0526-5
