Support measures for graph data*

Vanetik, N.; Shimony, S. E.; Gudes, E.

doi:10.1007/s10618-006-0044-8

Published: 26 May 2006

Volume 13, pages 243–260, (2006)

Data Mining and Knowledge Discovery

Abstract

The concept of support is central to data mining. While the definition of support in transaction databases is intuitive and simple, that is not the case in graph datasets and databases. Most mining algorithms require the support of a pattern to be no greater than that of its subpatterns, a property called anti-monotonicity, or admissibility. This paper examines the requirements for admissibility of a support measure. Support measures for mining graphs are usually based on the notion of an instance graph: a graph representing all the instances of the pattern in a database and their intersection properties. Necessary and sufficient conditions for support measure admissibility, based on operations on instance graphs, are developed and proved. The sufficient conditions are used to prove admissibility of one support measure: the size of the independent set in the instance graph. Conversely, the necessary conditions are used to quickly show that some other support measures, such as weighted count of instances, are not admissible.
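To make the idea concrete, here is a minimal sketch of an instance graph and an independent-set support measure. It is an illustration under stated assumptions, not the authors' implementation: instances are modelled as sets of data-graph vertices, two instances count as intersecting when they share a vertex, "the size of the independent set" is read as the size of a maximum independent set, and the names `instance_graph`, `mis_support`, and the toy star graph are hypothetical. The brute-force search is exponential and only suitable for tiny examples.

```python
from itertools import combinations


def instance_graph(instances):
    """Build adjacency sets: instances i and j are adjacent when they share an element.

    Assumption: 'intersection' means sharing at least one data-graph vertex.
    """
    adj = {i: set() for i in range(len(instances))}
    for i, j in combinations(range(len(instances)), 2):
        if instances[i] & instances[j]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def mis_support(instances):
    """Size of a maximum independent set in the instance graph (brute force)."""
    adj = instance_graph(instances)
    nodes = list(adj)
    # Try the largest candidate sets first; the first independent one is maximum.
    for k in range(len(nodes), 0, -1):
        for subset in combinations(nodes, k):
            if all(v not in adj[u] for u, v in combinations(subset, 2)):
                return k
    return 0


# Toy data graph (hypothetical): a star with centre "c" and leaves x1, x2, x3.
# Subpattern: the single centre vertex. Pattern: an edge from the centre to a leaf.
vertex_instances = [frozenset({"c"})]
edge_instances = [
    frozenset({"c", "x1"}),
    frozenset({"c", "x2"}),
    frozenset({"c", "x3"}),
]

print(len(vertex_instances), len(edge_instances))  # raw counts: 1 3
print(mis_support(vertex_instances), mis_support(edge_instances))  # MIS sizes: 1 1
```

In this toy case the raw instance count grows from 1 to 3 when the pattern is extended, so it is not anti-monotone, while the independent-set measure stays at 1 because all three edge instances intersect at the centre vertex.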

Author information

Authors and Affiliations

Department of Computer Science, Ben-Gurion University of the Negev, 84105, Beer-Sheva, Israel
N. Vanetik, S. E. Shimony & E. Gudes

Corresponding author

Correspondence to N. Vanetik.

Additional information

*Partially supported by the KITE consortium under contract to the Israeli Ministry of Trade and Industry, and by the Paul Ivanier Center for Robotics and Production Management.

About this article

Cite this article

Vanetik, N., Shimony, S.E. & Gudes, E. Support measures for graph data. Data Min Knowl Disc 13, 243–260 (2006). https://doi.org/10.1007/s10618-006-0044-8

Received: 18 July 2005
Accepted: 07 March 2006
Published: 26 May 2006
Issue Date: September 2006
DOI: https://doi.org/10.1007/s10618-006-0044-8
