Approximating a Set of Approximate Inclusion Dependencies

De Marchi, Fabien; Petit, Jean-Marc

doi:10.1007/3-540-32392-9_76


Part of the book series: Advances in Soft Computing (AINSC, volume 31)


Abstract

Approximating a collection of patterns is a new and active area of research in data mining. The main motivation lies in two observations: the number of mined patterns is often too large to be useful to end users, and the user-defined input parameters of many data mining algorithms (e.g. the frequency threshold) are most of the time chosen almost arbitrarily.

In this setting, we apply the results given in the seminal paper [11] for frequent sets to the problem of approximating a set of approximate inclusion dependencies by k inclusion dependencies. Using the fact that inclusion dependencies are “representable as sets”, we show how the approximation schemes defined in [11] for frequent patterns also apply in our context. A heuristic solution is also proposed for this particular problem. Although the quality of this approximation with respect to the best solution cannot be precisely characterized, an interaction property between inclusion dependencies (INDs) and functional dependencies (FDs) may be used to justify the heuristic.

Finally, we point out some perspectives for this work suggested by the results obtained so far.
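
To make the idea of approximating a collection by k of its members concrete, the following is a minimal sketch and not the authors' algorithm. It assumes that an inclusion dependency R[A1..An] ⊆ S[B1..Bn] is encoded as the set of attribute pairs {(A1, B1), ..., (An, Bn)}, that every subset of a member of the collection is also in the collection, and that k members are chosen by a greedy maximum-coverage rule in the spirit of the schemes of [11]. The names `downward_closure` and `greedy_k_cover` and the toy attribute names are hypothetical.

```python
from itertools import combinations


def downward_closure(ind):
    """All non-empty subsets of an IND encoded as a set of attribute pairs."""
    pairs = sorted(ind)
    return {
        frozenset(subset)
        for size in range(1, len(pairs) + 1)
        for subset in combinations(pairs, size)
    }


def greedy_k_cover(collection, k):
    """Greedily pick up to k members whose sub-INDs cover most of the collection."""
    collection = set(collection)
    # Deterministic candidate order: larger INDs first, then lexicographic.
    candidates = sorted(collection, key=lambda s: (-len(s), sorted(s)))
    chosen, covered = [], set()
    for _ in range(k):
        best, best_gain = None, 0
        for cand in candidates:
            # Gain = members of the collection newly covered by this candidate.
            gain = len((downward_closure(cand) & collection) - covered)
            if gain > best_gain:
                best, best_gain = cand, gain
        if best is None:  # nothing left to gain
            break
        chosen.append(best)
        covered |= downward_closure(best) & collection
    return chosen, covered


# Toy collection: two maximal INDs and all of their sub-INDs (hypothetical data).
wide = frozenset({("A", "D"), ("B", "E"), ("C", "F")})
narrow = frozenset({("A", "G"), ("B", "H")})
inds = downward_closure(wide) | downward_closure(narrow)

chosen, covered = greedy_k_cover(inds, k=2)
print(chosen == [wide, narrow], len(covered), len(inds))  # prints: True 10 10
```

Because the chosen sets are drawn from the collection itself, everything they cover belongs to the collection, so the approximation only errs by omission; with k=1 the sketch above would keep `wide` and cover 7 of the 10 dependencies.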


References

[1] M. Casanova, R. Fagin, and C. Papadimitriou. Inclusion dependencies and their interaction with functional dependencies. Journal of Computer and System Sciences, 24(1):29–59, 1984.
[2] M. A. Casanova, L. Tucherman, and A. L. Furtado. Enforcing inclusion dependencies and referential integrity. In F. Bancilhon and D. J. DeWitt, editors, International Conference on Very Large Data Bases (VLDB’88), Los Angeles, California, USA, pages 38–49. Morgan Kaufmann, 1988.
[3] Q. Cheng, J. Gryz, F. Koo, T. Y. Cliff Leung, L. Liu, X. Qian, and B. Schiefer. Implementation of two semantic query optimization techniques in DB2 universal database. In M. P. Atkinson, M. E. Orlowska, P. Valduriez, S. B. Zdonik, and M. L. Brodie, editors, International Conference on Very Large Data Bases (VLDB’99), Edinburgh, Scotland, UK, pages 687–698. Morgan Kaufmann, 1999.
[4] F. De Marchi, S. Lopes, and J.-M. Petit. Efficient algorithms for mining inclusion dependencies. In C. S. Jensen, K. G. Jeffery, J. Pokorný, S. Saltenis, E. Bertino, K. Böhm, and M. Jarke, editors, International Conference on Extending Database Technology (EDBT’02), Prague, Czech Republic, volume 2287 of Lecture Notes in Computer Science, pages 464–476. Springer, 2002.
[5] F. De Marchi, S. Lopes, J.-M. Petit, and F. Toumani. Analysis of existing databases at the logical level: the DBA companion project. ACM SIGMOD Record, 32(1):47–52, 2003.
[6] F. De Marchi and J.-M. Petit. Zigzag: a new algorithm for discovering large inclusion dependencies in relational databases. In International Conference on Data Mining (ICDM’03), Melbourne, Florida, USA, pages 27–34. IEEE Computer Society, 2003.
[7] F. Flouvat, F. De Marchi, and J.-M. Petit. ABS: Adaptive borders search of frequent itemsets. In FIMI’04, 2004.
[8] J. Han, J. Wang, Y. Lu, and P. Tzvetkov. Mining top-k frequent closed patterns without minimum support. In International Conference on Data Mining (ICDM 2002), Maebashi City, Japan, pages 211–218. IEEE Computer Society, 2002.
[9] D. Hochbaum. Approximation algorithms for NP-hard problems. PWS Publishing Company, 1997.
[10] M. Kantola, H. Mannila, K.-J. Räihä, and H. Siirtola. Discovering functional and inclusion dependencies in relational databases. International Journal of Intelligent Systems, 7:591–607, 1992.
[11] W. Kim, R. Kohavi, J. Gehrke, and W. DuMouchel, editors. Approximating a collection of frequent sets. ACM, 2004.
[12] A. Koeller and E. A. Rundensteiner. Discovery of high-dimensional inclusion dependencies (poster). In Poster session of International Conference on Data Engineering (ICDE’03). IEEE Computer Society, 2003.
[13] A. Koeller and E. A. Rundensteiner. Heuristic strategies for inclusion dependency discovery. In R. Meersman and Z. Tari, editors, CoopIS, DOA, and ODBASE, OTM Confederated International Conferences, Napa, Cyprus, Part II, volume 3291 of Lecture Notes in Computer Science, pages 891–908. Springer, 2004.
[14] M. Levene and G. Loizou. A Guided Tour of Relational Databases and Beyond. Springer, 1999.
[15] M. Levene and M. W. Vincent. Justification for inclusion dependency normal form. IEEE Transactions on Knowledge and Data Engineering, 12(2):281–291, 2000.
[16] S. Lopes, F. De Marchi, and J.-M. Petit. DBA Companion: A tool for logical database tuning. In Demo session of International Conference on Data Engineering (ICDE’04), http://www.isima.fr/~demarchi/dbacomp/, 2004. IEEE Computer Society.
[17] S. Lopes, J.-M. Petit, and F. Toumani. Discovering interesting inclusion dependencies: Application to logical database tuning. Information Systems, 17(1):1–19, 2002.
[18] H. Mannila and K.-J. Räihä. Inclusion dependencies in database design. In International Conference on Data Engineering (ICDE’86), Los Angeles, California, USA, pages 713–718. IEEE Computer Society, 1986.
[19] H. Mannila and K.-J. Räihä. The Design of Relational Databases. Addison-Wesley, second edition, 1994.
[20] H. Mannila and H. Toivonen. Levelwise Search and Borders of Theories in Knowledge Discovery. Data Mining and Knowledge Discovery, 1(1):241–258, 1997.
[21] R. J. Miller, M. A. Hernández, L. M. Haas, L. Yan, C. T. H. Ho, R. Fagin, and L. Popa. The Clio project: Managing heterogeneity. ACM SIGMOD Record, 30(1):78–83, 2001.


Author information

Authors and Affiliations

LIRIS, FRE CNRS 2672, Univ. Lyon 1, 69622, Villeurbanne, France
Fabien De Marchi
LIMOS, UMR CNRS 6158, Univ. Clermont-Ferrand II, 63177, Aubière, France
Jean-Marc Petit


Editor information

Editors and Affiliations

Institute of Computer Sciences, Polish Academy of Sciences, ul. Ordona 21, 01-237, Warszawa, Poland
Mieczysław A. Kłopotek, Sławomir T. Wierzchoń & Krzysztof Trojanowski


Cite this paper

De Marchi, F., Petit, JM. (2005). Approximating a Set of Approximate Inclusion Dependencies. In: Kłopotek, M.A., Wierzchoń, S.T., Trojanowski, K. (eds) Intelligent Information Processing and Web Mining. Advances in Soft Computing, vol 31. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32392-9_76


DOI: https://doi.org/10.1007/3-540-32392-9_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25056-2
Online ISBN: 978-3-540-32392-1
eBook Packages: Engineering, Engineering (R0)
