research-article

Towards a Better Tradeoff between Quality and Efficiency of Community Detection: An Inductive Embedding Method across Graphs

Authors:

Dit-Yan Yeung

ACM Transactions on Knowledge Discovery from Data, Volume 17, Issue 9

Article No.: 127, Pages 1 - 34

https://doi.org/10.1145/3596605

Published: 15 June 2023

Abstract

Many network applications can be formulated as NP-hard combinatorial optimization problems of community detection (CD), which partitions the nodes of a graph into several groups with dense linkage. Most existing CD methods are transductive: they optimize their models independently for each single graph, and can ensure only either high quality or high efficiency of CD, by using advanced machine learning techniques or fast heuristic approximation, respectively. In this study, we consider the CD task and aim to alleviate its NP-hard challenge. Motivated by the efficient inductive inference of graph neural networks (GNNs), we explore the possibility of achieving a better tradeoff between the quality and efficiency of CD via an inductive embedding scheme across multiple graphs of a system, and propose a novel inductive community detection (ICD) method. Concretely, ICD first conducts offline training of an adversarial dual GNN structure on historical graphs to capture key properties of a system. The trained model is then directly generalized to new graphs of the same system for online CD without additional optimization, where a better tradeoff between quality and efficiency can be achieved. Compared with existing inductive approaches, we develop a novel feature extraction module based on graph coarsening, which can efficiently extract informative feature inputs for GNNs. Moreover, our original designs of the adversarial dual GNN and the clustering regularization loss further enable ICD to capture permutation-invariant community labels in the offline training and help derive community-preserved embeddings to support high-quality online CD. Experiments on a set of benchmarks demonstrate that ICD achieves a significantly better tradeoff between quality and efficiency than various baselines.
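As an illustration of two ideas in the abstract (groups with dense linkage, and permutation-invariant community labels), the following minimal sketch scores a partition with Newman's modularity. It is not the ICD method. Modularity is assumed here purely as an example quality measure, since the abstract does not say which metrics the paper uses. The function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges  : list of (u, v) node-index pairs, each undirected edge listed once
    labels : labels[i] is the community label of node i
    """
    m = len(edges)
    degree = defaultdict(int)   # node -> degree
    intra = defaultdict(int)    # community -> number of internal edges
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1

    deg_sum = defaultdict(int)  # community -> total degree of its nodes
    for node, d in degree.items():
        deg_sum[labels[node]] += d

    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(intra[c] / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum)


# Toy graph (illustrative only): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

split = [0, 0, 0, 1, 1, 1]      # one community per triangle
swapped = [1, 1, 1, 0, 0, 0]    # same partition, label names exchanged
merged = [0, 0, 0, 0, 0, 0]     # everything in one community

print(modularity(edges, split))    # densely linked groups score higher
print(modularity(edges, swapped))  # identical to the score of `split`
print(modularity(edges, merged))   # trivial partition scores 0
```

Renaming the labels leaves the partition and its score unchanged, so a model trained across graphs cannot treat "community 0" as a fixed class. This is the sense in which community labels are permutation-invariant.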

Index Terms

Towards a Better Tradeoff between Quality and Efficiency of Community Detection: An Inductive Embedding Method across Graphs
1. Mathematics of computing
  1. Discrete mathematics
    1. Graph theory
      1. Graph algorithms
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Algorithmic game theory and mechanism design
      1. Social networks
    2. Machine learning theory
      1. Inductive inference

Information

Published In

ACM Transactions on Knowledge Discovery from Data Volume 17, Issue 9

November 2023

373 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/3604532

Editor:
Charu Aggarwal
IBM T. J. Watson Research, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 June 2023

Online AM: 08 May 2023

Accepted: 01 May 2023

Revised: 11 March 2023

Received: 02 October 2022

Published in TKDD Volume 17, Issue 9

Funding Sources

Council of Hong Kong under the Research Impact Fund

Cited By

Chen X, Liu C, Li X, Sun Y, Yu W, Jiao P. (2024). Link prediction in bipartite networks via effective integration of explicit and implicit relations. Neurocomputing 566:C. DOI: 10.1016/j.neucom.2023.127016. Online publication date: 4-Mar-2024.
https://dl.acm.org/doi/10.1016/j.neucom.2023.127016
Nooribakhsh M, Fernández-Diego M, González-Ladrón-De-Guevara F, Mollamotalebi M. (2024). Community detection in social networks using machine learning: a systematic mapping study. Knowledge and Information Systems 66:12 (7205-7259). DOI: 10.1007/s10115-024-02201-8. Online publication date: 1-Dec-2024.
https://dl.acm.org/doi/10.1007/s10115-024-02201-8
Qin M, Yeung D. (2023). Temporal Link Prediction: A Unified Framework, Taxonomy, and Review. ACM Computing Surveys 56:4 (1-40). DOI: 10.1145/3625820. Online publication date: 9-Nov-2023.
https://dl.acm.org/doi/10.1145/3625820
Gao Y, Qin M, Ding Y, Zeng L, Zhang C, Zhang W, Han W, Zhao R, Bai B. (2023). RaftGP: Random Fast Graph Partitioning. 2023 IEEE High Performance Extreme Computing Conference (HPEC) (1-7). DOI: 10.1109/HPEC58863.2023.10363495. Online publication date: 25-Sep-2023.
https://doi.org/10.1109/HPEC58863.2023.10363495
Guo K, Lin J, Zhuang Q, Zeng R, Wang J. (2023). Adaptive graph contrastive learning for community detection. Applied Intelligence 53:23 (28768-28786). DOI: 10.1007/s10489-023-05046-w. Online publication date: 13-Oct-2023.
https://dl.acm.org/doi/10.1007/s10489-023-05046-w
