DOI: 10.1145/2396761.2398454
short-paper

GRAFT: an approximate graphlet counting algorithm for large graph analysis

Published: 29 October 2012

Abstract

Graphlet frequency distribution (GFD) is an analysis tool for understanding the variance of local structure in a graph. Many recent works use GFD for comparing and characterizing real-life networks. However, the main bottleneck for graph analysis using GFD is the excessive computation cost of obtaining the frequency of each graphlet in a large network. To overcome this, we propose a simple yet powerful algorithm, called GRAFT, that obtains the approximate graphlet frequency for all graphlets that have up to 5 vertices. Compared to an exact counting algorithm, our algorithm achieves a speedup factor between 10 and 100 for a negligible counting error, which is, on average, less than 5%. For example, exact graphlet counting for ca-AstroPh takes approximately 3 days, but GRAFT runs for 45 minutes to perform the same task with a counting accuracy of 95.6%.
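The abstract does not describe how GRAFT works internally, so the following is only a minimal sketch of the general idea of trading exact counting for a sampled estimate. It assumes a generic edge-sampling estimator and covers just one graphlet, the triangle. It is not the authors' algorithm, and the function names (`build_adjacency`, `exact_triangles`, `approx_triangles`, `complete_graph`) and the `sample_fraction` parameter are hypothetical.

```python
import random
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of unique (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def exact_triangles(edges):
    """Exact triangle count: sum common neighbours over every edge."""
    adj = build_adjacency(edges)
    # Each triangle is found once per edge it contains, i.e. three times.
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def approx_triangles(edges, sample_fraction, seed=0):
    """Estimate the triangle count from a uniform sample of edges.

    Assumption: a plain edge-sampling estimator used for illustration,
    not the procedure from the paper.
    """
    adj = build_adjacency(edges)
    rng = random.Random(seed)
    k = max(1, round(sample_fraction * len(edges)))
    sample = rng.sample(edges, k)
    partial = sum(len(adj[u] & adj[v]) for u, v in sample)
    # Rescale the sampled count to all edges, then undo the triple counting.
    return partial * len(edges) / k / 3


def complete_graph(n):
    """Edge list of the complete graph on n vertices (toy test input)."""
    return list(combinations(range(n), 2))
```

In this sketch the work per run scales with the number of sampled edges, so a smaller `sample_fraction` gives a faster but noisier estimate. For scale, the ca-AstroPh figures quoted in the abstract (about 3 days versus 45 minutes) correspond to a speedup of roughly 96x, which sits inside the stated range of 10 to 100.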

      Published In

      CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
      October 2012
      2840 pages
      ISBN: 9781450311564
      DOI: 10.1145/2396761

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. approximate graphlet counting
      2. graph analysis
      3. graphlet frequency distribution

      Qualifiers

      • Short-paper

      Conference

      CIKM'12

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


