Approximating a Set of Approximate Inclusion Dependencies

  • Conference paper
Intelligent Information Processing and Web Mining

Part of the book series: Advances in Soft Computing ((AINSC,volume 31))

Abstract

Approximating a collection of patterns is a new and active area of research in data mining. The main motivation lies in two observations: the number of mined patterns is often too large to be useful to end users, and the user-defined input parameters of many data mining algorithms (e.g. the frequency threshold) are most of the time defined almost arbitrarily.

In this setting, we apply the results given in the seminal paper [11] for frequent sets to the problem of approximating a set of approximate inclusion dependencies with k inclusion dependencies. Using the fact that inclusion dependencies are “representable as sets”, we point out how the approximation schemes defined in [11] for frequent patterns also apply in our context. A heuristic solution is also proposed for this particular problem. Even though the quality of this approximation with respect to the best solution cannot be precisely defined, an interaction property between inclusion dependencies (INDs) and functional dependencies (FDs) may be used to justify this heuristic.
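As a rough illustration of what approximating a collection with k of its members can look like, the sketch below treats each inclusion dependency as a set of unary inclusion dependencies (one reading of “representable as sets”) and greedily picks k members whose subsets cover as many members of the collection as possible. This is a minimal sketch under stated assumptions, not the scheme of [11] or the heuristic proposed in the paper: the greedy coverage criterion, the encoding of a unary dependency as an (attribute, attribute) pair, and the names downward_closure and greedy_k_cover are all hypothetical.

```python
from itertools import combinations


def downward_closure(ind):
    """Return all non-empty subsets of an IND.

    Assumption: an IND is encoded as a frozenset of unary INDs, each unary
    IND being a (left_attribute, right_attribute) tuple.
    """
    items = sorted(ind)
    return {
        frozenset(subset)
        for size in range(1, len(items) + 1)
        for subset in combinations(items, size)
    }


def greedy_k_cover(collection, k):
    """Greedily choose up to k INDs from the collection.

    At each step, pick the IND whose subsets cover the largest number of
    members of the collection that are not covered yet (hypothetical
    criterion, chosen for illustration only).
    """
    collection = set(collection)
    candidates = set(collection)
    covered, chosen = set(), []
    for _ in range(k):
        best, best_gain = None, 0
        # Sorting makes tie-breaking deterministic.
        for cand in sorted(candidates, key=sorted):
            gain = len((downward_closure(cand) & collection) - covered)
            if gain > best_gain:
                best, best_gain = cand, gain
        if best is None:  # nothing left to gain
            break
        chosen.append(best)
        covered |= downward_closure(best) & collection
        candidates.discard(best)
    return chosen, covered


# Toy collection over hypothetical relations R and S.
u1, u2 = ("R.a", "S.x"), ("R.b", "S.y")
u3, u4 = ("R.c", "S.z"), ("R.d", "S.w")
inds = [frozenset(s) for s in ([u1], [u2], [u3], [u4], [u1, u2], [u3, u4])]
chosen, covered = greedy_k_cover(inds, k=2)
```

On this toy collection, the two binary dependencies are selected and together cover all six members; on real data the choice of coverage measure would need to follow the definitions in the paper.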

Some interesting perspectives for this work are pointed out, based on the results obtained so far.

References

  1. M. Casanova, R. Fagin, and C. Papadimitriou. Inclusion dependencies and their interaction with functional dependencies. Journal of Computer and System Sciences, 24(1):29–59, 1984.

  2. M. A. Casanova, L. Tucherman, and A. L. Furtado. Enforcing inclusion dependencies and referential integrity. In F. Bancilhon and D. J. DeWitt, editors, International Conference on Very Large Data Bases (VLDB’88), Los Angeles, California, USA, pages 38–49. Morgan Kaufmann, 1988.

  3. Q. Cheng, J. Gryz, F. Koo, T. Y. Cliff Leung, L. Liu, X. Qian, and B. Schiefer. Implementation of two semantic query optimization techniques in DB2 universal database. In M. P. Atkinson, M. E. Orlowska, P. Valduriez, S. B. Zdonik, and M. L. Brodie, editors, International Conference on Very Large Data Bases (VLDB’99), Edinburgh, Scotland, UK, pages 687–698. Morgan Kaufmann, 1999.

  4. F. De Marchi, S. Lopes, and J.-M. Petit. Efficient algorithms for mining inclusion dependencies. In C. S. Jensen, K. G. Jeffery, J. Pokorný, S. Saltenis, E. Bertino, K. Böhm, and M. Jarke, editors, International Conference on Extending Database Technology (EDBT’02), Prague, Czech Republic, volume 2287 of Lecture Notes in Computer Science, pages 464–476. Springer, 2002.

  5. F. De Marchi, S. Lopes, J.-M. Petit, and F. Toumani. Analysis of existing databases at the logical level: the DBA companion project. ACM Sigmod Record, 32(1):47–52, 2003.

  6. F. De Marchi and J-M. Petit. Zigzag: a new algorithm for discovering large inclusion dependencies in relational databases. In International Conference on Data Mining (ICDM’03), Melbourne, Florida, USA, pages 27–34. IEEE Computer Society, 2003.

  7. F. Flouvat, F. De Marchi, and J-M. Petit. Abs: Adaptive borders search of frequent itemsets. In FIMI’04, 2004.

  8. J. Han, J. Wang, Y. Lu, and P. Tzvetkov. Mining top-k frequent closed patterns without minimum support. In International Conference on Data Mining (ICDM 2002), Maebashi City, Japan, pages 211–218. IEEE Computer Society, 2002.

  9. Dorit Hochbaum. Approximation algorithms for NP-hard problems. PWS Publishing Company, 1997.

  10. M. Kantola, H. Mannila, K. J. Räihä, and H. Siirtola. Discovering functional and inclusion dependencies in relational databases. International Journal of Intelligent Systems, 7:591–607, 1992.

  11. W. Kim, R. Kohavi, J. Gehrke, and W. DuMouchel, editors. Approximating a collection of frequent sets. ACM, 2004.

  12. A. Koeller and E. A. Rundensteiner. Discovery of high-dimensional inclusion dependencies (poster). In Poster session of International Conference on Data Engineering (ICDE’03). IEEE Computer Society, 2003.

  13. A. Koeller and E. A. Rundensteiner. Heuristic strategies for inclusion dependency discovery. In R. Meersman and Z. Tari, editors, CoopIS, DOA, and ODBASE, OTM Confederated International Conferences, Napa, Cyprus, Part II, volume 3291 of Lecture Notes in Computer Science, pages 891–908. Springer, 2004.

  14. M. Levene and G. Loizou. A Guided Tour of Relational Databases and Beyond. Springer, 1999.

  15. M. Levene and M. W. Vincent. Justification for inclusion dependency normal form. IEEE Transactions on Knowledge and Data Engineering, 12(2):281–291, 2000.

  16. S. Lopes, F. De Marchi, and J.-M. Petit. DBA Companion: A tool for logical database tuning. In Demo session of International Conference on Data Engineering (ICDE’04), http://www.isima.fr/~demarchi/dbacomp/, 2004. IEEE Computer Society.

  17. S. Lopes, J.-M. Petit, and F. Toumani. Discovering interesting inclusion dependencies: Application to logical database tuning. Information Systems, 17(1):1–19, 2002.

  18. H. Mannila and K.-J. Räihä. Inclusion dependencies in database design. In International Conference on Data Engineering (ICDE’86), Los Angeles, California, USA, pages 713–718. IEEE Computer Society, 1986.

  19. H. Mannila and K. J. Räihä. The Design of Relational Databases. Addison-Wesley, second edition, 1994.

  20. H. Mannila and H. Toivonen. Levelwise Search and Borders of Theories in Knowledge Discovery. Data Mining and Knowledge Discovery, 1(1):241–258, 1997.

  21. R. J. Miller, M. A. Hernández, L. M. Haas, L. Yan, C. T. H. Ho, R. Fagin, and L. Popa. The clio project: Managing heterogeneity. ACM SIGMOD Record, 30(1):78–83, 2001.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

De Marchi, F., Petit, JM. (2005). Approximating a Set of Approximate Inclusion Dependencies. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 31. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32392-9_76

  • DOI: https://doi.org/10.1007/3-540-32392-9_76

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25056-2

  • Online ISBN: 978-3-540-32392-1

  • eBook Packages: Engineering, Engineering (R0)