Abstract
We address the problem of mining frequent conjunctive queries in a relational database, a problem known to be intractable even for conjunctive queries over a single table. In this article, we show that mining frequent projection-selection-join queries becomes tractable if joins are performed along keys and foreign keys, in a database satisfying functional and inclusion dependencies, under certain restrictions. We note that these restrictions cover most practical cases, including databases operating over star schemas, snowflake schemas and constellation schemas. In our approach, we define an equivalence relation over queries using a pre-ordering with respect to which the support is shown to be anti-monotonic. We propose a level-wise algorithm for computing all frequent queries by exploiting the fact that equivalent queries have the same support. We report on experiments showing that, in our context, mining frequent projection-selection-join queries is indeed tractable, even for large data sets.
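To illustrate how anti-monotonic support enables level-wise pruning, here is a minimal sketch in Python. It is not the algorithm of the article: it covers only conjunctions of attribute = value selections over a single in-memory table, with no joins, no projections and no grouping of equivalent queries. The function name `frequent_selections`, the row representation (a list of dictionaries) and the definition of support as a plain row count are all assumptions made for this example.

```python
from itertools import combinations


def frequent_selections(rows, min_support):
    """Hypothetical helper: level-wise search for frequent conjunctions
    of attribute=value conditions over a list of dict rows."""

    def support(conds):
        # Assumed definition: number of rows satisfying every condition.
        return sum(all(row.get(a) == v for a, v in conds) for row in rows)

    # Level 1: every single attribute=value condition seen in the data.
    singles = sorted({(a, v) for row in rows for a, v in row.items()})
    level = {frozenset([c]): support([c]) for c in singles}
    level = {q: s for q, s in level.items() if s >= min_support}
    result = dict(level)

    k = 1
    while level:
        k += 1
        candidates = set()
        for q1, q2 in combinations(level, 2):
            cand = q1 | q2
            if len(cand) != k:
                continue  # the two queries do not differ by exactly one condition
            if len({a for a, _ in cand}) != k:
                continue  # two different values for one attribute: always empty
            # Anti-monotonicity: a query can be frequent only if every
            # sub-conjunction with k-1 conditions is frequent.
            if all(frozenset(sub) in level for sub in combinations(cand, k - 1)):
                candidates.add(cand)
        level = {c: support(c) for c in candidates}
        level = {q: s for q, s in level.items() if s >= min_support}
        result.update(level)
    return result
```

Each level only evaluates candidates whose sub-conjunctions all survived the previous level, which is the pruning that anti-monotonicity justifies. The article additionally exploits the fact that equivalent queries have the same support, which this sketch does not attempt to model.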
Cite this article
Dieng, C.T., Jen, TY., Laurent, D. et al. Mining frequent conjunctive queries using functional and inclusion dependencies. The VLDB Journal 22, 125–150 (2013). https://doi.org/10.1007/s00778-012-0277-7
DOI: https://doi.org/10.1007/s00778-012-0277-7