Abstract
The discovery of differential dependencies (DDs) is the problem of finding a minimal cover of the DDs that hold in a given relation. This paper proposes a novel subspace-clustering-based approach to mining DDs. We study and reveal a link between δ-nClusters and differential functions (DFs). Based on this relationship, we adapt and co-opt techniques for mining δ-nClusters to efficiently find the set of candidate antecedent DFs of DDs under a user-specified distance threshold. Furthermore, we define an interestingness measure for DDs to aid the discovery of essential DDs and to avoid mining an extremely large set of DDs. Finally, we demonstrate the scalability and efficiency of our solution through experiments on real-world benchmark datasets.
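For readers unfamiliar with DDs, the following minimal sketch illustrates what it means for a single DD to hold in a relation. It is not the mining approach proposed in this paper. It assumes the usual pairwise reading of a DD: whenever two tuples satisfy the distance constraints (DFs) on the antecedent attributes, they must also satisfy those on the consequent attributes. The absolute difference as distance metric, the interval encoding of a DF, and the names `satisfies` and `dd_holds` are all assumptions made for illustration, as is the toy data.

```python
from itertools import combinations


def satisfies(t1, t2, constraints):
    """Return True if the pair's distance on each attribute lies in its [lo, hi] interval.

    `constraints` maps an attribute name to an inclusive (lo, hi) distance interval.
    The distance metric is assumed to be the absolute difference of numeric values.
    """
    return all(lo <= abs(t1[a] - t2[a]) <= hi for a, (lo, hi) in constraints.items())


def dd_holds(relation, lhs, rhs):
    """Check a DD lhs -> rhs by brute force over all tuple pairs (quadratic time)."""
    for t1, t2 in combinations(relation, 2):
        # A violation is a pair that agrees with the antecedent but not the consequent.
        if satisfies(t1, t2, lhs) and not satisfies(t1, t2, rhs):
            return False
    return True


# Hypothetical toy relation, made up for this illustration.
relation = [
    {"date": 1, "price": 100},
    {"date": 2, "price": 104},
    {"date": 9, "price": 160},
]

# "Tuples at most 2 apart on date are at most 5 apart on price."
holds = dd_holds(relation, {"date": (0, 2)}, {"price": (0, 5)})

# Widening the antecedent to 8 admits the pair (date 1, date 9), which violates the DD.
violated = dd_holds(relation, {"date": (0, 8)}, {"price": (0, 5)})
```

The pairwise check makes plain why a user-specified distance threshold matters: loosening the antecedent interval brings more tuple pairs into scope, and any one of them can invalidate the dependency.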
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kwashie, S., Liu, J., Li, J., Ye, F. (2014). Mining Differential Dependencies: A Subspace Clustering Approach. In: Wang, H., Sharaf, M.A. (eds) Databases Theory and Applications. ADC 2014. Lecture Notes in Computer Science, vol 8506. Springer, Cham. https://doi.org/10.1007/978-3-319-08608-8_5
DOI: https://doi.org/10.1007/978-3-319-08608-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08607-1
Online ISBN: 978-3-319-08608-8
eBook Packages: Computer Science, Computer Science (R0)