Abstract
Differential dependencies (DDs) are a recently proposed class of data dependencies that capture relationships amongst data values. Like classical functional dependencies (FDs), DDs are defined to hold over entire instances of relations. This paper proposes a novel extension of the DD theory that holds over subsets of relations, called conditional DDs (CDDs), analogous to the relaxation of FDs to conditional FDs (CFDs) [4] and conditional FDs with predicates (CFDPs) [6]. We present the formal definitions, the consistency and implication analysis, and a set of axioms for inferring CDDs. Furthermore, we study the discovery problem for CDDs and present an algorithm for mining a minimal cover set \(\varSigma _c\) of constant CDDs from a given instance of a relation. We also propose an interestingness measure for ranking the discovered CDDs and reducing the size \(|\varSigma _c|\) of \(\varSigma _c\). We demonstrate the efficiency, effectiveness and scalability of the discovery algorithm through experiments on both real and synthetic datasets.
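To make the idea of a dependency that holds only over a subset of a relation concrete, the following is a minimal sketch and not the paper's algorithm or formal definition. It assumes a DD of the common form "if two tuples are within a given distance range on attribute X, then they are within a given distance range on attribute Y", with absolute difference as the distance, and it models the condition as a plain predicate on tuples. The function name `dd_holds`, the toy rows, and the thresholds are all invented for illustration; only the attribute abbreviations (PW, SL, CL) are borrowed from the Notes.

```python
from itertools import combinations


def dd_holds(rows, lhs, rhs, condition=lambda row: True):
    """Check a differential dependency on the rows that satisfy `condition`.

    `lhs` and `rhs` are (attribute, (low, high)) pairs: whenever the distance
    between two selected rows on the lhs attribute falls in its range, the
    distance on the rhs attribute must fall in its range as well.
    Hypothetical helper for illustration only.
    """
    (x, (x_lo, x_hi)), (y, (y_lo, y_hi)) = lhs, rhs
    selected = [r for r in rows if condition(r)]
    for r1, r2 in combinations(selected, 2):
        if x_lo <= abs(r1[x] - r2[x]) <= x_hi:
            if not (y_lo <= abs(r1[y] - r2[y]) <= y_hi):
                return False  # this pair violates the dependency
    return True


# Invented toy relation (not taken from the Iris dataset).
rows = [
    {"CL": "setosa", "PW": 0.2, "SL": 5.1},
    {"CL": "setosa", "PW": 0.3, "SL": 5.0},
    {"CL": "virginica", "PW": 2.0, "SL": 6.0},
    {"CL": "virginica", "PW": 2.1, "SL": 7.5},
]

lhs = ("PW", (0.0, 0.2))  # similar petal width ...
rhs = ("SL", (0.0, 0.5))  # ... implies similar sepal length

# Checked over the whole relation, then only over the rows with CL = setosa.
holds_everywhere = dd_holds(rows, lhs, rhs)
holds_on_setosa = dd_holds(rows, lhs, rhs, lambda r: r["CL"] == "setosa")
```

In this toy data the two virginica rows have similar petal widths but very different sepal lengths, so the dependency fails over the whole relation while still holding on the setosa subset, which is the kind of situation a conditional dependency is meant to describe.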
Notes
- 1. PW is petal width; SL is sepal length; CL is Iris class.
- 2. We use \(A\langle \mathtt {a}\rangle \) for a constant CS of A, and A[w] for a DF of A.
- 3. \(w_a=[x,y]\), where x and y are the minimum and maximum distances of values in \(\varPsi _A\), respectively.
- 4. Similar to the dependent quality measure in [14].
References
- 1. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: 20th International Conference on VLDB, pp. 487–499 (1994)
- 2. Armstrong, W.W.: Dependency structures of data base relationships. In: World Computer Congress - IFIP, pp. 580–583 (1974)
- 3. Bache, K., Lichman, M.: UCI Machine Learning repository (2013). http://archive.ics.uci.edu/ml
- 4. Bohannon, P., Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Conditional functional dependencies for data cleaning. In: 23rd ICDE, pp. 746–755 (2007)
- 5. Bravo, L., Fan, W., Ma, S.: Extending dependencies with conditions. In: 33rd International Conference on VLDB, pp. 243–254 (2007)
- 6. Chen, W., Fan, W., Ma, S.: Analyses and validation of conditional dependencies with built-in predicates. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds.) DEXA 2009. LNCS, vol. 5690, pp. 576–591. Springer, Heidelberg (2009)
- 7. Fan, W., Geerts, F.: Foundations of Data Quality Management. Synth. Lect. Data Manage. 4(5), 1–127 (2012). doi:10.2200/S00439ED1V01Y201207DTM030
- 8. Kwashie, S., Liu, J., Li, J., Ye, F.: Mining differential dependencies: a subspace clustering approach. In: Wang, H., Sharaf, M.A. (eds.) ADC 2014. LNCS, vol. 8506, pp. 50–61. Springer, Heidelberg (2014)
- 9. Li, J., Liu, G., Wong, L.: Mining statistically important equivalence classes and delta-discriminative emerging patterns. In: 13th ACM SIGKDD, pp. 430–439 (2007)
- 10. Liu, G., Li, J., Sim, K., Wong, L.: Distance based subspace clustering with flexible dimension partitioning. In: 23rd ICDE, pp. 1250–1254 (2007)
- 11. Liu, J., Kwashie, S., Li, J., Ye, F., Vincent, M.W.: Discovery of approximate differential dependencies. CoRR, abs/1309.3733 (2013)
- 12. Simpson, E.H.: The interpretation of interaction in contingency tables. J. Roy. Stat. Soc. Ser. B (Stat. Meth.) 13(2), 238–241 (1951)
- 13. Song, S., Chen, L.: Differential dependencies: reasoning and discovery. ACM Trans. Database Syst. 36(3), 16:1–16:41 (2011)
- 14. Song, S., Chen, L., Cheng, H.: Parameter-free determination of distance thresholds for metric distance constraints. In: 28th ICDE, pp. 846–857 (2012)
- 15. Uno, T., Kiyomi, M., Arimura, H.: LCM ver.3: collaboration of array, bitmap and prefix tree for frequent itemset mining. In: 1st International Workshop on Open Source Data Mining, pp. 77–86 (2005)
Acknowledgement
This work is partially supported by NSFC 61472166.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kwashie, S., Liu, J., Li, J., Ye, F. (2015). Conditional Differential Dependencies (CDDs). In: Tadeusz, M., Valduriez, P., Bellatreche, L. (eds) Advances in Databases and Information Systems. ADBIS 2015. Lecture Notes in Computer Science, vol 9282. Springer, Cham. https://doi.org/10.1007/978-3-319-23135-8_1
DOI: https://doi.org/10.1007/978-3-319-23135-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23134-1
Online ISBN: 978-3-319-23135-8
eBook Packages: Computer Science, Computer Science (R0)