research-article

Towards a Better Tradeoff between Quality and Efficiency of Community Detection: An Inductive Embedding Method across Graphs

Published: 15 June 2023

Abstract

Many network applications can be formulated as NP-hard combinatorial optimization problems of community detection (CD), which partitions the nodes of a graph into several groups with dense linkage. Most existing CD methods are transductive: they optimize their models independently for each single graph and can ensure only one of high quality or high efficiency, by using advanced machine learning techniques or fast heuristic approximation, respectively. In this study, we consider the CD task and aim to alleviate its NP-hard challenge. Motivated by the efficient inductive inference of graph neural networks (GNNs), we explore the possibility of achieving a better tradeoff between the quality and efficiency of CD via an inductive embedding scheme across multiple graphs of a system, and we propose a novel inductive community detection (ICD) method. Concretely, ICD first conducts the offline training of an adversarial dual GNN structure on historical graphs to capture key properties of a system. The trained model is then directly generalized to new graphs of the same system for online CD without additional optimization, where a better tradeoff between quality and efficiency can be achieved. Compared with existing inductive approaches, we develop a novel feature extraction module based on graph coarsening, which can efficiently extract informative feature inputs for GNNs. Moreover, our original designs of the adversarial dual GNN and the clustering regularization loss further enable ICD to capture permutation-invariant community labels in the offline training and help derive community-preserved embeddings to support high-quality online CD. Experiments on a set of benchmarks demonstrate that ICD can achieve a significantly better tradeoff between quality and efficiency than various baselines.
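To make the notion of "groups with dense linkage" concrete, the following minimal sketch scores a candidate partition of a small graph. The abstract does not name a quality objective; Newman modularity is assumed here purely for illustration, and the function name `modularity` and the toy graph are hypothetical, not part of ICD.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Newman modularity Q of a partition (illustrative assumption, not ICD itself).

    edges  -- list of (u, v) pairs of an undirected, unweighted graph without self-loops
    labels -- dict mapping each node to its community label
    """
    m = len(edges)
    degree = defaultdict(int)
    intra = defaultdict(int)  # number of edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1

    total_degree = defaultdict(int)  # sum of node degrees per community
    for node, d in degree.items():
        total_degree[labels[node]] += d

    # Q = sum over communities of (fraction of intra-community edges
    #     minus the expected fraction under a degree-preserving null model).
    return sum(intra[c] / m - (total_degree[c] / (2 * m)) ** 2 for c in total_degree)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Under this assumed objective, the partition that separates the two triangles scores higher than the one that lumps every node together. Searching over all partitions for the best score is the kind of combinatorial problem the abstract describes as NP-hard.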


Published In

ACM Transactions on Knowledge Discovery from Data  Volume 17, Issue 9
November 2023
373 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3604532

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 June 2023
Online AM: 08 May 2023
Accepted: 01 May 2023
Revised: 11 March 2023
Received: 02 October 2022
Published in TKDD Volume 17, Issue 9


Author Tags

  1. Community detection
  2. graph clustering
  3. inductive graph representation learning

Qualifiers

  • Research-article

Funding Sources

  • Council of Hong Kong under the Research Impact Fund

Cited By

  • (2024) Link prediction in bipartite networks via effective integration of explicit and implicit relations. Neurocomputing 566:C. DOI: 10.1016/j.neucom.2023.127016. Online publication date: 4-Mar-2024
  • (2024) Community detection in social networks using machine learning: a systematic mapping study. Knowledge and Information Systems 66:12 (7205-7259). DOI: 10.1007/s10115-024-02201-8. Online publication date: 1-Dec-2024
  • (2023) Temporal Link Prediction: A Unified Framework, Taxonomy, and Review. ACM Computing Surveys 56:4 (1-40). DOI: 10.1145/3625820. Online publication date: 9-Nov-2023
  • (2023) RaftGP: Random Fast Graph Partitioning. 2023 IEEE High Performance Extreme Computing Conference (HPEC) (1-7). DOI: 10.1109/HPEC58863.2023.10363495. Online publication date: 25-Sep-2023
  • (2023) Adaptive graph contrastive learning for community detection. Applied Intelligence 53:23 (28768-28786). DOI: 10.1007/s10489-023-05046-w. Online publication date: 13-Oct-2023
