
Mining Differential Dependencies: A Subspace Clustering Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8506)

Abstract

The discovery of differential dependencies (DDs) is the problem of finding a minimal cover of the DDs that hold in a given relation. This paper proposes a novel subspace-clustering-based approach to mining these DDs. We establish a link between δ-nClusters and differential functions (DFs). Based on this relationship, we adapt techniques for mining δ-nClusters to efficiently find the set of candidate antecedent DFs of DDs, subject to a user-specified distance threshold. Furthermore, we define an interestingness measure for DDs that aids the discovery of essential DDs and avoids mining an extremely large set of them. Finally, we demonstrate the scalability and efficiency of our solution through experiments on real-world benchmark datasets.
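For readers unfamiliar with the terminology, the following minimal sketch illustrates what it means for a single DD to hold in a relation. It is not the mining algorithm proposed in the paper. It assumes the usual pairwise reading of a DD from the literature: whenever two tuples satisfy the antecedent differential function, they must also satisfy the consequent one. The names `satisfies` and `dd_holds`, the toy relation, the absolute-difference metric and the distance intervals are all hypothetical choices made for illustration.

```python
from itertools import combinations


def satisfies(t1, t2, constraints):
    """Check a differential function on one tuple pair.

    `constraints` maps an attribute name to a closed interval (lo, hi).
    Assumption: distance is the absolute difference of numeric values.
    """
    return all(
        lo <= abs(t1[attr] - t2[attr]) <= hi
        for attr, (lo, hi) in constraints.items()
    )


def dd_holds(relation, lhs, rhs):
    """Return True if the DD lhs -> rhs holds for every pair of tuples."""
    for t1, t2 in combinations(relation, 2):
        # A violation is a pair that is close on the antecedent attributes
        # but does not meet the consequent distance constraint.
        if satisfies(t1, t2, lhs) and not satisfies(t1, t2, rhs):
            return False
    return True


# Hypothetical toy relation: a price observed on different days.
relation = [
    {"day": 1, "price": 100},
    {"day": 2, "price": 104},
    {"day": 3, "price": 109},
    {"day": 10, "price": 160},
]

lhs = {"day": (0, 2)}          # tuples at most 2 days apart
rhs_loose = {"price": (0, 10)}  # prices differ by at most 10
rhs_tight = {"price": (0, 4)}   # prices differ by at most 4

print(dd_holds(relation, lhs, rhs_loose))
print(dd_holds(relation, lhs, rhs_tight))
```

This brute-force check compares every pair of tuples, so it is quadratic in the size of the relation and is only meant to make the definition concrete. The abstract's approach of deriving candidate antecedent DFs from δ-nClusters targets the much harder problem of discovering which DDs to check in the first place.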




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Kwashie, S., Liu, J., Li, J., Ye, F. (2014). Mining Differential Dependencies: A Subspace Clustering Approach. In: Wang, H., Sharaf, M.A. (eds) Databases Theory and Applications. ADC 2014. Lecture Notes in Computer Science, vol 8506. Springer, Cham. https://doi.org/10.1007/978-3-319-08608-8_5


  • DOI: https://doi.org/10.1007/978-3-319-08608-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08607-1

  • Online ISBN: 978-3-319-08608-8

  • eBook Packages: Computer Science, Computer Science (R0)
