ABSTRACT
We derive two extensions of the K-means algorithm for community detection in feature-rich networks. We define a data-recovery criterion that additively combines two conventional least-squares criteria: one for approximating the network link data and one for approximating the feature data at the nodes, each by a partition together with its within-cluster "centers". The method operates in a space whose dimension is the sum of the number of nodes and the number of features, which can be very high. To mitigate the resulting curse of dimensionality, we optionally replace the Euclidean distance with the cosine distance. We validate the proposed methods experimentally and demonstrate their efficiency by comparing them with the most popular approaches.
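The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a feature matrix Y (nodes by features) and a link matrix P (nodes by nodes), so each node is represented by its feature row plus its link row, and each cluster by a feature center and a link center. The function names, the `init` parameter, the equal weighting of the two terms, and the choice to apply the cosine distance to each data block separately are all assumptions made for illustration.

```python
import numpy as np


def _distances(X, C, metric):
    """Distances between rows of X (n x d) and centers C (k x d), shape n x k."""
    if metric == "euclidean":
        # Squared Euclidean distance, matching a least-squares criterion.
        return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    if metric == "cosine":
        x_norm = np.linalg.norm(X, axis=1, keepdims=True)
        c_norm = np.linalg.norm(C, axis=1, keepdims=True)
        # Guard against division by zero for all-zero rows.
        sim = (X @ C.T) / np.maximum(x_norm * c_norm.T, 1e-12)
        return 1.0 - sim
    raise ValueError(f"unknown metric: {metric}")


def kmeans_feature_rich(Y, P, k, metric="euclidean", init=None, n_iter=100, seed=0):
    """K-means-style clustering of nodes from features Y and links P.

    Hypothetical helper: the per-node cost is the sum of a feature term
    and a link term (an assumed equal weighting of the two).
    """
    Y = np.asarray(Y, dtype=float)
    P = np.asarray(P, dtype=float)
    n = Y.shape[0]
    if init is None:
        # Assumed seeding: k distinct nodes chosen at random.
        init = np.random.default_rng(seed).choice(n, size=k, replace=False)
    init = np.asarray(init)
    c_feat, c_link = Y[init].copy(), P[init].copy()
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: additive combination of the two distances.
        d = _distances(Y, c_feat, metric) + _distances(P, c_link, metric)
        new_labels = d.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
        # Update step: centers are within-cluster means of both data blocks.
        for j in range(k):
            members = labels == j
            if members.any():
                c_feat[j] = Y[members].mean(axis=0)
                c_link[j] = P[members].mean(axis=0)
    return labels, c_feat, c_link


# Toy network: two triangles, with features that separate them.
Y = [[5, 0], [5, 1], [6, 0], [0, 5], [1, 5], [0, 6]]
P = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    P[a, b] = P[b, a] = 1.0

labels_euc, _, _ = kmeans_feature_rich(Y, P, k=2, metric="euclidean", init=[0, 5])
labels_cos, _, _ = kmeans_feature_rich(Y, P, k=2, metric="cosine", init=[0, 5])
print(labels_euc, labels_cos)
```

With the cosine option the sketch still uses plain within-cluster means as centers, which is a heuristic rather than an exact minimizer of a cosine-based criterion. In practice the two data blocks would usually be standardized before being combined.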