DOI: 10.1145/3487351.3488356
short-paper

Community detection in feature-rich networks to meet K-means

Published: 19 January 2022

ABSTRACT

We derive two extensions of the celebrated K-means algorithm as tools for community detection in feature-rich networks. We define a data-recovery criterion that additively combines the conventional least-squares criteria for approximating the network link data and the feature data at network nodes by a partition together with its within-cluster "centers". The method operates in a space whose dimension is the sum of the number of nodes and the number of features, which may be very high. To tackle the so-called curse of dimensionality, we may, in some cases, replace the innate Euclidean distance with the cosine distance. We experimentally validate the proposed methods and demonstrate their efficiency by comparing them with the most popular approaches.
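
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify data normalization, the relative weighting of the link and feature terms, or the seeding rule. The sketch therefore assumes equal weighting, a deterministic farthest-first seeding, and a plain within-cluster mean as the "center" even when the cosine distance is used. The names kmeans_feature_rich, sq_euclidean, and cosine_distance are hypothetical. Each node is represented by its feature row concatenated with its link row, so the working dimension is the number of features plus the number of nodes.

```python
import math


def sq_euclidean(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def cosine_distance(x, c):
    """1 - cosine similarity; a zero vector is treated as distance 1."""
    nx = math.sqrt(sum(a * a for a in x))
    nc = math.sqrt(sum(b * b for b in c))
    if nx == 0.0 or nc == 0.0:
        return 1.0
    return 1.0 - sum(a * b for a, b in zip(x, c)) / (nx * nc)


def kmeans_feature_rich(links, features, k, distance=sq_euclidean, n_iter=100):
    """Cluster nodes on concatenated [feature row | link row] vectors.

    links    : N x N adjacency (or weight) matrix, list of lists
    features : N x V feature matrix, list of lists
    Returns a list of N cluster labels in range(k).
    """
    data = [list(f) + list(p) for f, p in zip(features, links)]  # dim = V + N
    # Farthest-first seeding: an assumption of this sketch, not from the paper.
    centers = [data[0]]
    while len(centers) < k:
        centers.append(
            max(data, key=lambda x: min(distance(x, c) for c in centers))
        )
    labels = None
    for _ in range(n_iter):
        # Assignment step: each node joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: distance(x, centers[c])) for x in data
        ]
        if new_labels == labels:
            break  # the partition is stable
        labels = new_labels
        # Update step: each center becomes the mean of its cluster members.
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy network: two triangles, with one binary feature that matches them.
links = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
features = [[1.0], [1.0], [1.0], [0.0], [0.0], [0.0]]

print(kmeans_feature_rich(links, features, k=2))
print(kmeans_feature_rich(links, features, k=2, distance=cosine_distance))
```

On this toy input both distance choices should separate the two triangles. With the cosine distance, the mean is only a convenient stand-in for the center; a stricter variant would normalize the centers to unit length.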

Published in

ASONAM '21: Proceedings of the 2021 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
November 2021
693 pages
ISBN: 9781450391283
DOI: 10.1145/3487351

Copyright © 2021 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

ASONAM '21 paper acceptance rate: 22 of 118 submissions (19%). Overall acceptance rate: 116 of 549 submissions (21%).
