Knowledge-Guided Maximal Clique Enumeration

Harenberg, Steve; Seay, Ramona G.; Bello, Gonzalo A.; Chirkova, Rada Y.; Doraiswamy, P. Murali; Samatova, Nagiza F.

doi:10.1007/978-3-319-49586-6_43

Steve Harenberg^18,19,
Ramona G. Seay^18,19,
Gonzalo A. Bello^18,19,
Rada Y. Chirkova¹⁸,
P. Murali Doraiswamy²⁰ &
…
Nagiza F. Samatova^18,19

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 10086))

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

2407 Accesses

Abstract

Maximal clique enumeration is a long-standing problem in graph mining and knowledge discovery. Numerous classic algorithms exist for solving this problem. However, these algorithms focus on enumerating all maximal cliques, which may be computationally impractical and much of the output may be irrelevant to the user. To address this issue, we introduce the problem of knowledge-biased clique enumeration, a query-driven formulation that reduces output space, computation time, and memory usage. Moreover, we introduce a dynamic state space indexing strategy for efficiently processing multiple queries over the same graph. This strategy reduces redundant computations by dynamically indexing the constituent state space generated with each query. Experimental results over real-world networks demonstrate this strategy’s effectiveness at reducing the cumulative query-response time. Although developed in the context of maximal cliques, our techniques could possibly be generalized to other constraint-based graph enumeration tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Noaa earth system research laboratory. http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.ersst.html
Abello, J., Resende, M.G.C., Sudarsky, S.: Massive Quasi-Clique detection. In: Rajsbaum, S. (ed.) LATIN 2002. LNCS, vol. 2286, pp. 598–612. Springer, Heidelberg (2002). doi:10.1007/3-540-45995-2_51
Chapter Google Scholar
Angles, R.: A comparison of current graph database models. In: Workshops Proceedings of the IEEE 28th International Conference on Data Engineering, ICDE, pp. 171–177 (2012)
Google Scholar
Bast, H., Funke, S., Matijevic, D., Sanders, P., Schultes, D.: In transit to constant time shortest-path queries in road networks. In: Proceedings of the Nine Workshop on Algorithm Engineering and Experiments, ALENEX (2007)
Google Scholar
Bron, C., Kerbosch, J.: Finding all cliques of an undirected graph (algorithm 457). Commun. ACM 16(9), 575–576 (1973)
Article MATH Google Scholar
Creamer, G., Rowe, R., Hershkop, S., Stolfo, S.J.: Segmentation and automated social hierarchy detection through email network analysis. In: Advances in Web Mining and Web Usage Analysis, 9th International Workshop on Knowledge Discovery on the Web, WebKDD 2007, pp. 40–58 (2007)
Google Scholar
Cui, W., Xiao, Y., Wang, H., Lu, Y., Wang, W.: Online search of overlapping communities. In: Proceedings of the International Conference on Management of Data, SIGMOD 2013, pp. 277–288 (2013)
Google Scholar
Cui, W., Xiao, Y., Wang, H., Wang, W.: Local search of communities in large graphs. In: International Conference on Management of Data, SIGMOD, pp. 991–1002 (2014)
Google Scholar
Davis, A.P., Murphy, C.G., Johnson, R., Lay, J.M., Lennon-Hopkins, K., Saraceni-Richards, C.A., Sciaky, D., King, B.L., Rosenstein, M.C., Wiegers, T.C., Mattingly, C.J.: The comparative toxicogenomics database: update 2013. Nucleic Acids Res. 41(D1), D1104–D1114 (2013)
Article Google Scholar
Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J., Minguez, P., Bork, P., von Mering, C., Jensen, L.J.: STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res. 41(D1), D808–D815 (2013)
Article Google Scholar
Goldenberg, S.B., Shapiro, L.J.: Physical mechanisms for the association of El Niño and West African rainfall with Atlantic major hurricane activity. J. Clim. 9(6), 1169–1187 (1996)
Article Google Scholar
Iordanov, B.: HyperGraphDB: a generalized graph database. In: Web-Age Information Management, pp. 25–36 (2010)
Google Scholar
Jia, L., Ye, J., Haiyan, L.V., Wang, W., Zhou, C., Zhang, X., Xu, J., Wang, L., Jia, J.: Genetic association between polymorphisms of pen2 gene and late onset Alzheimer’s disease in the north chinese population. Brain Res. 1141, 10–14 (2007)
Article Google Scholar
Klotzbach, P.J.: On the Madden-Julian oscillation-Atlantic hurricane relationship. J. Clim. 23(2), 282–293 (2010)
Article Google Scholar
Modani, N., Dey, K.: Large maximal cliques enumeration in large sparse graphs. In: Proceedings of the 15th International Conference on Management of Data (2009)
Google Scholar
Moon, J.W., Moser, L.: On cliques in graphs. Israel J. Math. 3(1), 23–28 (1965)
Article MathSciNet MATH Google Scholar
Sozio, M., Gionis, A.: The community-search problem and how to plan a successful cocktail party. In: Proceedings of the 16th International Conference on Knowledge Discovery and Data Mining, KDD. pp. 939–948 (2010)
Google Scholar
Steinhaeuser, K., Chawla, N.V., Ganguly, A.R.: Complex networks as a unified framework for descriptive analysis and predictive modeling in climate science. Stat. Anal. Data Min. 4(5), 497–511 (2011)
Article MathSciNet Google Scholar
Tsourakakis, C.E., Bonchi, F., Gionis, A., Gullo, F., Tsiarli, M.A.: Denser than the densest subgraph: extracting optimal quasi-cliques with quality guarantees. In: The 19th International Conference on Knowledge Discovery and Data Mining, KDD, pp. 104–112 (2013)
Google Scholar
Vicknair, C., Macias, M., Zhao, Z., Nan, X., Chen, Y., Wilkins, D.: A comparison of a graph database and a relational database: a data provenance perspective. In: Proceedings of the 48th Annual Southeast Regional Conference, p. 42 (2010)
Google Scholar
Vimont, D.J., Kossin, J.P.: The Atlantic meridional mode and hurricane activity. Geophysical Research Letters 34(7) (2007)
Google Scholar
Wei, F.: TEDI: efficient shortest path query answering on graphs. In: Graph Data Management: Techniques and Applications., pp. 214–238 (2011)
Google Scholar
Wilkinson, K., Sayers, C., Kuno, H.A., Reynolds, D.: Efficient RDF storage and retrieval in Jena2. In: Proceedings of The First International Workshop on Semantic Web and Databases, SWDB, pp. 131–150 (2003)
Google Scholar
Xiao, Y., Wu, W., Pei, J., Wang, W., He, Z.: Efficiently indexing shortest paths by exploiting symmetry in graphs. In: 12th International Conference on Extending Database Technology, EDBT, pp. 493–504 (2009)
Google Scholar
Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. In: 12th International Conference on Data Mining, ICDM, pp. 745–754 (2012)
Google Scholar
Zhang, B., Park, B.H., Karpinets, T., Samatova, N.F.: From pull-down data to protein interaction networks and complexes with biological relevance. Bioinformatics 24(7), 979–986 (2008)
Article Google Scholar
Zhu, F., Yan, X., Han, J., Yu, P.S.: gPrune: a constraint pushing framework for graph pattern mining. In: Zhou, Z.-H., Li, H., Yang, Q. (eds.) PAKDD 2007. LNCS (LNAI), vol. 4426, pp. 388–400. Springer, Heidelberg (2007). doi:10.1007/978-3-540-71701-0_38
Chapter Google Scholar

Download references

Acknowledgments

The authors would like to thank David A. Boyuka II and Sriram Lakshminarasimhan for their initial discussions and feedback with this work. This material is based upon work supported in part by the Laboratory for Analytic Sciences and the U.S. DOE Office of Science ASCR. Any opinions or findings expressed in this material are those of the author(s) and do not necessarily reflect the views of any agent or entity of the US government.

Author information

Authors and Affiliations

North Carolina State University, Raleigh, North Carolina, USA
Steve Harenberg, Ramona G. Seay, Gonzalo A. Bello, Rada Y. Chirkova & Nagiza F. Samatova
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Steve Harenberg, Ramona G. Seay, Gonzalo A. Bello & Nagiza F. Samatova
Duke University, Durham, North Carolina, USA
P. Murali Doraiswamy

Authors

Steve Harenberg
View author publications
You can also search for this author in PubMed Google Scholar
Ramona G. Seay
View author publications
You can also search for this author in PubMed Google Scholar
Gonzalo A. Bello
View author publications
You can also search for this author in PubMed Google Scholar
Rada Y. Chirkova
View author publications
You can also search for this author in PubMed Google Scholar
P. Murali Doraiswamy
View author publications
You can also search for this author in PubMed Google Scholar
Nagiza F. Samatova
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Nagiza F. Samatova .

Editor information

Editors and Affiliations

University of Technology , Sydney, New South Wales, Australia
Jinyan Li
University of Queensland , Brisbane, Australia
Xue Li
Beijing Institute of Technology , Beijing, China
Shuliang Wang
University of Western Australia , Crawley, West Australia, Australia
Jianxin Li
University of Adelaide , Adelaide, South Australia, Australia
Quan Z. Sheng

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Harenberg, S., Seay, R.G., Bello, G.A., Chirkova, R.Y., Doraiswamy, P.M., Samatova, N.F. (2016). Knowledge-Guided Maximal Clique Enumeration. In: Li, J., Li, X., Wang, S., Li, J., Sheng, Q. (eds) Advanced Data Mining and Applications. ADMA 2016. Lecture Notes in Computer Science(), vol 10086. Springer, Cham. https://doi.org/10.1007/978-3-319-49586-6_43

Download citation

DOI: https://doi.org/10.1007/978-3-319-49586-6_43
Published: 13 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49585-9
Online ISBN: 978-3-319-49586-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics