Finding Itemset-Sharing Patterns in a Large Itemset-Associated Graph

Fukuzaki, Mutsumi; Seki, Mio; Kashima, Hisashi; Sese, Jun

doi:10.1007/978-3-642-13672-6_15


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6119)

Abstract

Itemset mining and graph mining have attracted considerable attention in the field of data mining, since they have many important applications in areas such as biology, marketing, and social network analysis. However, most existing studies focus on either itemset mining or graph mining alone, and only a few have addressed a combination of both. In this paper, we introduce a new problem, which we call itemset-sharing subgraph (ISS) set enumeration, where the task is to find sets of subgraphs with common itemsets in a large graph in which each vertex has an associated itemset. The problem has various interesting potential applications, such as side-effect analysis in drug discovery and the analysis of the influence of word-of-mouth communication in social-network marketing. We propose an efficient algorithm, ROBIN, for finding ISS sets in such a graph; this algorithm enumerates connected subgraphs having common itemsets and finds their combinations. Experiments using a synthetic network verify that our method can efficiently process networks with more than one million edges. Experiments using a real biological network show that our algorithm can find biologically interesting patterns. We also apply ROBIN to a citation network and find successful collaborative research works.
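To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the ROBIN algorithm, whose details are not given in the abstract. It assumes a simplified reading of the task: for every candidate itemset, keep the vertices whose itemsets contain it, take the connected components of the induced subgraph, and report the components together as one group. The function names, the thresholds `min_items` and `min_vertices`, and the toy graph are all hypothetical.

```python
from itertools import combinations


def components_sharing(adj, items, itemset):
    """Connected components of the subgraph induced by the vertices
    whose associated itemset contains every item in `itemset`."""
    keep = {v for v in adj if itemset <= items[v]}
    seen, comps = set(), []
    for start in sorted(keep):
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:  # iterative depth-first search restricted to `keep`
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w in keep and w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(frozenset(comp))
    return comps


def naive_iss_sets(adj, items, min_items=2, min_vertices=2):
    """Map each candidate itemset to the connected subgraphs sharing it.

    Exhaustive over all itemsets, so exponential in the number of
    distinct items; suitable only for toy inputs (assumed thresholds).
    """
    universe = sorted(set().union(*items.values()))
    result = {}
    for k in range(min_items, len(universe) + 1):
        for combo in combinations(universe, k):
            itemset = frozenset(combo)
            comps = [c for c in components_sharing(adj, items, itemset)
                     if len(c) >= min_vertices]
            if comps:
                result[itemset] = comps
    return result


# Hypothetical toy graph: edges a-b, b-c, d-e, with one itemset per vertex.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": {"e"}, "e": {"d"}}
items = {"a": {"x", "y"}, "b": {"x", "y", "z"}, "c": {"z"},
         "d": {"x", "y"}, "e": {"x", "y"}}
found = naive_iss_sets(adj, items)
```

On this toy input, the only itemset of size two or more shared by a connected subgraph of at least two vertices is {x, y}, which is shared by the two disjoint subgraphs {a, b} and {d, e}. The exhaustive loop over itemsets is what an efficient method would need to avoid on graphs with millions of edges.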



Author information

Authors and Affiliations

Dept. of Computer Science, Ochanomizu Univ., 2-1-1 Otsuka, Bunkyo, Tokyo, Japan
Mutsumi Fukuzaki, Mio Seki & Jun Sese
Dept. of Math. Informatics, Univ. of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo, Japan
Hisashi Kashima


Editor information

Editors and Affiliations

Computer Science Department, Rensselaer Polytechnic Institute, USA
Mohammed J. Zaki
The Chinese University of Hong Kong, China
Jeffrey Xu Yu
IIT Madras, Chennai, India
B. Ravindran
IIIT, Hyderabad, India
Vikram Pudi


About this paper

Cite this paper

Fukuzaki, M., Seki, M., Kashima, H., Sese, J. (2010). Finding Itemset-Sharing Patterns in a Large Itemset-Associated Graph. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2010. Lecture Notes in Computer Science, vol 6119. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13672-6_15


DOI: https://doi.org/10.1007/978-3-642-13672-6_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13671-9
Online ISBN: 978-3-642-13672-6
eBook Packages: Computer Science (R0)
