skip to main content
10.1145/1772690.1772762acmotherconferencesArticle/Chapter ViewAbstractPublication PagesthewebconfConference Proceedingsconference-collections
research-article

Sampling community structure

Published: 26 April 2010 Publication History

Abstract

We propose a novel method, based on concepts from expander graphs, to sample communities in networks. We show that our sampling method, unlike previous techniques, produces subgraphs representative of community structure in the original network. These generated subgraphs may be viewed as stratified samples in that they consist of members from most or all communities in the network. Using samples produced by our method, we show that the problem of community detection may be recast into a case of statistical relational learning. We empirically evaluate our approach against several real-world datasets and demonstrate that our sampling method can effectively be used to infer and approximate community affiliation in the larger network.

References

[1]
M. Adler and M. Mitzenmacher. Towards compressing web graphs. In Proc. of the IEEE DCC, pages 203--212, 2000.
[2]
M. Boguna, R. P. Satorras, A. D. Guilera, and A. Arenas. Models of social networks based on social distance attachment. Physical Review E, 70(5), 2004.
[3]
E. Bullmore and O. Sporns. Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews. Neuroscience, Feb 2009.
[4]
B. A. Cipra. The best of the 20th century: Editors name top 10 algorithms. SIAM News, 33, 2000.
[5]
A. Clauset, M. E. J. Newman, and C. Moore. Finding community structure in very large networks. Physical Review E, 70(6):066111+, Dec 2004.
[6]
R. Clayford and T. Johnson. Operational parameters affecting use of anonymous cell phone tracking for generating traffic information. In 82nd TRB Annual Meeting, Jan 2003.
[7]
D. Crandall, D. Cosley, D. Huttenlocher, J. Kleinberg, and S. Suri. Feedback effects between similarity and social influence in online communities. In KDD '08, pages 160--168, 2008.
[8]
L. Danon, A. Diaz-Guilera, J. Duch, and A. Arenas. Comparing community structure identification. J. Stat. Mech.: Theory and Experiment, 2005(09):P09008+, Sep 2005.
[9]
J. Duch and A. Arenas. Community detection in complex networks using extremal optimization. Physical Review E, 72(2):027104+, 2005.
[10]
T. Feder and R. Motwani. Clique partitions, graph compression and speeding-up algorithms. In J. Comp. and Sys. Sci., pages 123--133, 1991.
[11]
S. Fortunato. Community detection in graphs. Jun 2009.
[12]
O. Frank. Network Sampling and Model Fitting. Cambridge University Press, February 2005.
[13]
J. Gehrke, P. Ginsparg, and J. Kleinberg. Overview of the 2003 kdd cup. SIGKDD Explor. Newsl., 5(2):149--151, 2003.
[14]
A. C. Gilbert and K. Levchenko. Compressing network graphs. In Proc. LinkKDD workshop at the 10th ACM Conference on KDD, Aug 2004.
[15]
M. Girvan and M. E. Newman. Community structure in social and biological networks. Proc Natl Acad Sci USA, 99(12):7821--7826, Jun 2002.
[16]
S. Goel, Matthew, and J. Salganik. Respondent-driven sampling as markov chain monte carlo. Stat. in Medicine, 2009.
[17]
B. H. Good, Y.-A. de Montjoye, and A. Clauset. The performance of modularity maximization in practical contexts. arXiv ePrints, 0910.0165, 2009.
[18]
D. Gusfield. Partition-distance: A problem and class of perfect graphs arising in clustering. Info. Proc. Letters, 82(3):159--164, May 2002.
[19]
S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull. Amer. Math. Soc., 43:439--561, 2006.
[20]
C. Hubler, H.-P. Kriegel, K. Borgwardt, and Z. Ghahramani. Metropolis algorithms for representative subgraph sampling. In ICDM '08, pages 283--292, 2008.
[21]
H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai, and A. L. Barabasi. The large-scale organization of metabolic networks. Nature, 407(6804):651--654, Oct 2000.
[22]
R. Kannan, S. Vempala, and A. Vetta. On clusterings: Good, bad and spectral. J. ACM, 51(3):497--515, May 2004.
[23]
J. Leskovec, L. Backstrom, R. Kumar, and A. Tomkins. Microscopic evolution of social networks. In KDD '08, pages 462--470, 2008.
[24]
J. Leskovec and C. Faloutsos. Sampling from large graphs. In KDD '06, pages 631--636, New York, NY, USA, 2006.
[25]
J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In KDD '05, pages 177--187, 2005.
[26]
D. J. C. Mackay. Information Theory, Inference & Learning Algorithms. Cambridge University Press, Jun 2002.
[27]
S. A. Macskassy and F. Provost. Classification in networked data: A toolkit and a univariate case study. J. Mach. Learn. Res., 8:935--983, 2007.
[28]
A. Mehler and S. Skiena. Expanding network communities from representative examples. ACM TKDD, 3(2):1--27, 2009.
[29]
N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation of state calculations by fast computing machines. J. Chem. Phys., 21(6):1087--92, 1953.
[30]
M. E. J. Newman. Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74(3), 2006.
[31]
D. Rafiei. Effectively visualizing large networks through sampling. In Proc. VIS 05, pages 375--382, 2005.
[32]
M. Richardson, R. Agrawal, and P. Domingos. Trust Management for the Semantic Web, volume 2870. January 2003.
[33]
S. Wasserman, K. Faust, and D. Iacobucci. Social Network Analysis : Methods and Applications. Cambridge University Press, Nov 1994.
[34]
W. Zachary. An information flow model for conflict and fission in small groups. J. Anthropological Research, 33:452--473, 1977.

Cited By

View all
  • (2024)GraphSAID: Graph Sampling via Attention based Integer Programming MethodProceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems10.5555/3635637.3662983(1256-1264)Online publication date: 6-May-2024
  • (2024)Representative and Back-In-Time Sampling from Real-world HypergraphsACM Transactions on Knowledge Discovery from Data10.1145/365330618:6(1-48)Online publication date: 26-Apr-2024
  • (2024)Go-Network: a graph sampling library written in GoCompanion of the 15th ACM/SPEC International Conference on Performance Engineering10.1145/3629527.3652903(151-155)Online publication date: 7-May-2024
  • Show More Cited By

Index Terms

  1. Sampling community structure

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Other conferences
    WWW '10: Proceedings of the 19th international conference on World wide web
    April 2010
    1407 pages
    ISBN:9781605587998
    DOI:10.1145/1772690

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 26 April 2010

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. clustering
    2. community detection
    3. complex networks
    4. graphs
    5. sampling
    6. social networks

    Qualifiers

    • Research-article

    Conference

    WWW '10
    WWW '10: The 19th International World Wide Web Conference
    April 26 - 30, 2010
    North Carolina, Raleigh, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)39
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 22 Feb 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)GraphSAID: Graph Sampling via Attention based Integer Programming MethodProceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems10.5555/3635637.3662983(1256-1264)Online publication date: 6-May-2024
    • (2024)Representative and Back-In-Time Sampling from Real-world HypergraphsACM Transactions on Knowledge Discovery from Data10.1145/365330618:6(1-48)Online publication date: 26-Apr-2024
    • (2024)Go-Network: a graph sampling library written in GoCompanion of the 15th ACM/SPEC International Conference on Performance Engineering10.1145/3629527.3652903(151-155)Online publication date: 7-May-2024
    • (2024)An Extensive Characterization of Graph Sampling AlgorithmsCompanion of the 15th ACM/SPEC International Conference on Performance Engineering10.1145/3629527.3652899(135-140)Online publication date: 7-May-2024
    • (2024)SamBaS: Sampling-Based Stochastic Block PartitioningIEEE Transactions on Network Science and Engineering10.1109/TNSE.2024.335830111:3(3053-3065)Online publication date: May-2024
    • (2024)Ricci Curvature-Based Graph Sparsification for Continual Graph Representation LearningIEEE Transactions on Neural Networks and Learning Systems10.1109/TNNLS.2023.330345435:12(17398-17410)Online publication date: Dec-2024
    • (2024)Denoising Item Graph With Disentangled Learning for RecommendationIEEE Transactions on Knowledge and Data Engineering10.1109/TKDE.2024.336148236:7(2942-2955)Online publication date: Jul-2024
    • (2024)Extracting High-Fidelity Smaller Scale Subgraphs of Complex Networks by Edge-Reinforced Random WalkIEEE Transactions on Computational Social Systems10.1109/TCSS.2024.338177711:5(6181-6191)Online publication date: Oct-2024
    • (2024)Graph Sampling Quality Prediction for Algorithm Recommendation2024 IEEE 31st International Conference on High Performance Computing, Data, and Analytics (HiPC)10.1109/HiPC62374.2024.00032(243-254)Online publication date: 18-Dec-2024
    • (2024)Sampling unknown large networks restricted by low sampling ratesScientific Reports10.1038/s41598-024-64018-314:1Online publication date: 10-Jun-2024
    • Show More Cited By

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    EPUB

    View this article in ePub.

    ePub

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media