Is Community Detection Fully Unsupervised? The Case of Weighted Graphs

Connes, Victor; Dugué, Nicolas; Guille, Adrien

doi:10.1007/978-3-030-05411-3_21

Part of the book series: Studies in Computational Intelligence (SCI, volume 812)

Included in the following conference series:

International Conference on Complex Networks and their Applications

Abstract

In the field of natural language processing (NLP), word embeddings have recently attracted a lot of attention. A textual corpus is represented as a sparse word co-occurrence matrix. This matrix can then be factorized, for example using singular value decomposition (SVD), which yields a smaller matrix of dense, continuous vectors. To help SVD, the pointwise mutual information (PMI) measure is applied to the initial co-occurrence matrix: it assigns a relevant weight to each co-occurrence by normalizing it with the frequencies of both words involved. In this paper, we follow this idea to study whether weighted networks can benefit from a pre-processing step that helps community detection. We first design a benchmark using LFR networks. Then, we consider PMI and another NLP-inspired measure as pre-processing of the link weights, and show that PMI worsens the results while the other measure improves them. By separating links inside communities and links between communities into two classes, we show that this is due to the weight distributions of these links: weights of links between communities are larger on average, which leads to larger PMI values. From this analysis, we design another set of experiments showing that links can be classified efficiently into these two classes using a small set of features. Finally, we introduce the Supervised Label Propagation (SLP) algorithm, which takes the classification results into account during propagation. This algorithm clearly improves the results, which raises a major question: is community detection on weighted networks a fully unsupervised task? We conclude with our thoughts on this topic.
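To make the PMI pre-processing step concrete, here is a minimal sketch of how link weights could be re-weighted with PMI. The abstract does not give the exact formulation used in the paper, so this assumes the standard definition transposed to an undirected weighted graph: the joint probability of a link is its weight divided by the sum of all node strengths, and the marginal probability of a node is its strength divided by that same sum. The function name `pmi_reweight` and the edge-dictionary input format are illustrative choices, not taken from the paper or its repository.

```python
import math
from collections import defaultdict


def pmi_reweight(edges):
    """Return PMI-transformed weights for an undirected weighted graph.

    edges: dict mapping (u, v) to a strictly positive weight, with each
    undirected link listed once. This input format is an assumption made
    for the sketch.
    """
    # Node strength = sum of the weights of the links incident to the node.
    strength = defaultdict(float)
    for (u, v), w in edges.items():
        strength[u] += w
        strength[v] += w

    # Sum of all strengths (each link weight is counted twice).
    total_strength = sum(strength.values())

    # PMI(u, v) = log( p(u, v) / (p(u) * p(v)) )
    #           = log( w_uv * S / (s_u * s_v) )
    # with p(u, v) = w_uv / S and p(u) = s_u / S (assumed normalization).
    return {
        (u, v): math.log(w * total_strength / (strength[u] * strength[v]))
        for (u, v), w in edges.items()
    }


if __name__ == "__main__":
    toy_edges = {("a", "b"): 2.0, ("b", "c"): 2.0}
    for link, value in pmi_reweight(toy_edges).items():
        print(link, round(value, 4))
```

Because PMI values can be negative, a community detection algorithm that expects non-negative weights would need a further adjustment (for instance clipping at zero, as in positive PMI); whether the paper applies such a step is not stated in the abstract.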

Notes

1. Experiments and graphs can be found at https://github.com/nicolasdugue/WeightedCommunityDetection/

Author information

Authors and Affiliations

Le Mans Université, LIUM, EA 4023, Laboratoire d’Informatique de l’Université du Mans, Le Mans, France
Victor Connes & Nicolas Dugué
University of Lyon, ERIC, 5, avenue Pierre Mendès France, 69676, Bron Cedex, France
Adrien Guille

Corresponding author

Correspondence to Nicolas Dugué.

Editor information

Editors and Affiliations

Nokia Bell Labs, Cambridge, UK
Luca Maria Aiello
IUT Lumière, University of Lyon, Bron Cedex, France
Chantal Cherifi
LE2I UMR CNRS 6306 9, University of Burgundy, Dijon Cedex, France
Hocine Cherifi
Mathematical Institute, University of Oxford, Oxford, UK
Renaud Lambiotte
Department of Computer Science and Technology, The Computer Laboratory, University of Cambridge, Cambridge, UK
Pietro Lió
Center for Complex Networks and Systems Research, School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
Luis M. Rocha

About this paper

Cite this paper

Connes, V., Dugué, N., Guille, A. (2019). Is Community Detection Fully Unsupervised? The Case of Weighted Graphs. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 812. Springer, Cham. https://doi.org/10.1007/978-3-030-05411-3_21

DOI: https://doi.org/10.1007/978-3-030-05411-3_21
Published: 02 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05410-6
Online ISBN: 978-3-030-05411-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
