Enriching networks with edge insertion to improve community detection

  • Original Article
  • Published in Social Network Analysis and Mining

Abstract

Community detection is a broad area of study in network science, in which correctly identifying communities reveals information about the groups in a network and the relationships between their nodes. Community detection algorithms use the available snapshot of a network to detect its underlying communities. If this snapshot is incomplete, however, the algorithms may fail to recover the correct communities. This work proposes a set of link prediction heuristics that use different network properties to estimate a more complete version of the network and thereby improve the results of community detection algorithms. Each heuristic returns the edges most likely to be observed in a future version of the network. We performed experiments on real-world and artificial networks with different insertion sizes, comparing the results with two baselines: (i) no edge insertion and (ii) the EdgeBoost algorithm, which is based on node similarity measures. The experiments show that some of the proposed heuristics improve the results of traditional community detection algorithms, and that the improvement is more pronounced for networks with poorly defined structures.
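
The enrichment step described in the abstract can be illustrated with a minimal sketch. It does not reproduce the heuristics proposed in the article; as a stand-in, it scores non-adjacent node pairs with the Adamic-Adar index (Adamic and Adar 2003), a common node similarity measure, and inserts the `k` highest-scoring pairs as new edges. The function names `adamic_adar_scores` and `enrich`, the parameter `k`, and the toy graph are assumptions made for illustration only.

```python
from itertools import combinations
from math import log


def adamic_adar_scores(edges):
    """Score every non-adjacent node pair by the Adamic-Adar index.

    Illustrative stand-in for a link prediction heuristic; not the
    method proposed in the article.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # the pair is already connected
        common = adj[u] & adj[v]
        # A common neighbor of two distinct nodes has degree >= 2,
        # so log(degree) is strictly positive.
        score = sum(1.0 / log(len(adj[w])) for w in common)
        if score > 0:
            scores[(u, v)] = score
    return scores


def enrich(edges, k):
    """Return the edge list plus the k highest-scoring predicted edges."""
    scores = adamic_adar_scores(edges)
    # Sort by descending score; break ties by node pair for determinism.
    ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
    return list(edges) + ranked[:k]


# Toy snapshot (hypothetical): nodes 0-3 form a dense group with the
# edge (0, 3) unobserved, linked by the bridge (3, 4) to triangle 4-5-6.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6), (5, 6)]
enriched = enrich(edges, 1)
print(enriched[-1])  # (0, 3)
```

In this toy case the inserted edge falls inside the dense group rather than across the bridge, which is the effect the enrichment aims for. The enriched edge list would then be passed to any community detection algorithm in place of the original snapshot.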

Notes

  1. Source code available at https://github.com/EricCamacho/LINE.

  2. igraph, a library of network analysis tools that can be used from Python, R, and C/C++ (Csardi and Nepusz 2006).

References

  • Adamic LA, Adar E (2003) Friends and neighbors on the web. Social Netw 25(3):211–230

  • Al Hasan M, Zaki MJ (2011) A survey of link prediction in social networks. In: Social network data analytics. Springer, pp 243–275

  • Ana L, Jain AK (2003) Robust data clustering. In: 2003 IEEE computer society conference on computer vision and pattern recognition, 2003. Proceedings, IEEE, vol 2, pp II–128

  • Atay Y, Koc I, Babaoglu I, Kodaz H (2017) Community detection from biological and social networks: a comparative analysis of metaheuristic algorithms. Appl Soft Comput 50:194–211

  • Ayoub J, Lotfi D, El Marraki M, Hammouch A (2020) Accurate link prediction method based on path length between a pair of unlinked nodes and their degree. Social Netw Anal Min 10(1):1–13

  • Bilenko M, Mooney R, Cohen W, Ravikumar P, Fienberg S (2003) Adaptive name matching in information integration. IEEE Intell Syst 18(5):16–23

  • Biswas A, Biswas B (2017) Community-based link prediction. Multimed Tools Appl 76(18):18619–18639

  • Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E (2008) Fast unfolding of communities in large networks. J Stat Mech Theory Exp 2008(10):P10008

  • Burgess M, Adar E, Cafarella M (2016) Link-prediction enhanced consensus clustering for complex networks. PLoS ONE 11(5):e0153384

  • Cheng HM, Ning YZ, Yin Z, Yan C, Liu X, Zhang ZY (2018) Community detection in complex networks using link prediction. Mod Phys Lett B 32(01):1850004

  • Choumane A, Awada A, Harkous A (2020) Core expansion: a new community detection algorithm based on neighborhood overlap. Social Netw Anal Min 10:1–11

  • Chunaev P (2020) Community detection in node-attributed social networks: a survey. Comput Sci Rev 37:100286

  • Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111

  • Csardi G, Nepusz T (2006) The igraph software package for complex network research. InterJ Complex Syst 1695:1–9

  • Fortunato S, Barthelemy M (2007) Resolution limit in community detection. Proc Natl Acad Sci 104(1):36–41

  • Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826

  • Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218

  • Interdonato R, Tagarelli A, Ienco D, Sallaberry A, Poncelet P (2017) Local community detection in multilayer networks. Data Min Knowl Discov 31(5):1444–1479

  • Javed MA, Younis MS, Latif S, Qadir J, Baig A (2018) Community detection in networks: a multidisciplinary review. J Netw Comput Appl 108:87–111

  • Jonsson PF, Cavanna T, Zicha D, Bates PA (2006) Cluster analysis of networks generated through homology: automatic identification of important protein communities involved in cancer metastasis. BMC Bioinform 7(1):2

  • Kanawati R (2014) Yasca: an ensemble-based approach for community detection in complex networks. In: International computing and combinatorics conference, Springer, pp 657–666

  • Lancichinetti A, Fortunato S, Radicchi F (2008) Benchmark graphs for testing community detection algorithms. Phys Rev E 78(4):046110

  • Leicht EA, Holme P, Newman ME (2006) Vertex similarity in networks. Phys Rev E 73(2):026120

  • Li W, Huang C, Wang M, Chen X (2017) Stepping community detection algorithm based on label propagation and similarity. Phys A Stat Mech Appl 472:145–155

  • Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Am Soc Inform Sci Technol 58(7):1019–1031

  • Lusseau D, Schneider K, Boisseau OJ, Haase P, Slooten E, Dawson SM (2003) The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations. Behav Ecol Sociobiol 54(4):396–405

  • Makris C, Pispirigos G, Rizos IO (2020) A distributed bagging ensemble methodology for community prediction in social networks. Information 11(4):199

  • Malhotra D, Goyal R (2020) Link prediction in complex networks using information-theoretic measures. J Complex Netw 8(4):cnaa035

  • McCune B, Grace JB, Urban DL (2002) Analysis of ecological communities, vol 28. MjM Software Design, Gleneden Beach, OR

  • Nassar H, Benson AR, Gleich DF (2020) Neighborhood and pagerank methods for pairwise link prediction. Social Netw Anal Min 10(1):1–13

  • Newman ME (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74(3):036104

  • Nicolini C, Bordier C, Bifone A (2017) Community detection in weighted brain connectivity networks beyond the resolution limit. Neuroimage 146:28–39

  • Ostilli M, Yoneki E, Leung IX, Mendes JF, Lió P, Crowcroft J (2010) Ising model of rumour spreading in interacting communities. Tech. rep., University of Cambridge, Computer Laboratory

  • Pachev B, Webb B (2018) Fast link prediction for large networks using spectral embedding. J Complex Netw 6(1):79–94

  • Pons P, Latapy M (2005) Computing communities in large networks using random walks. In: International symposium on computer and information sciences, Springer, pp 284–293

  • Poulin V, Théberge F (2019) Ensemble clustering for graphs: comparisons and applications. Appl Netw Sci 4(1):1–13

  • Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D (2004) Defining and identifying communities in networks. Proc Natl Acad Sci 101(9):2658–2663

  • Raghavan UN, Albert R, Kumara S (2007) Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E 76(3):036106

  • Reichardt J, Bornholdt S (2006) Statistical mechanics of community detection. Phys Rev E 74(1):016110

  • Rossetti G, Cazabet R (2018) Community discovery in dynamic networks: a survey. ACM Comput Surv 51(2):1–37

  • Rosvall M, Bergstrom CT (2007) An information-theoretic framework for resolving community structure in complex networks. Proc Natl Acad Sci 104(18):7327–7331

  • Rosvall M, Bergstrom CT (2008) Maps of random walks on complex networks reveal community structure. Proc Natl Acad Sci 105(4):1118–1123

  • Salton G (1989) Automatic text processing: the transformation, analysis, and retrieval of information by computer, vol 169. Addison-Wesley, Reading

  • Sorensen TA (1948) A method of establishing groups of equal amplitude in plant sociology based on similarity of species content and its application to analyses of the vegetation on Danish commons. Biol Skr 5:1–34

  • Stegehuis C, van der Hofstad R, van Leeuwaarden JS (2016) Epidemic spreading on complex networks with community structures. Sci Rep 6:29748

  • Su Y, Wang B, Cheng F, Zhang L, Zhang X, Pan L (2017) An algorithm based on positive and negative links for community detection in signed networks. Sci Rep 7(1):1–12

  • Tagarelli A, Amelio A, Gullo F (2017) Ensemble-based community detection in multilayer networks. Data Min Knowl Discov 31(5):1506–1543

  • Taguchi H, Murata T, Liu X (2020) Bimlpa: community detection in bipartite networks by multi-label propagation. In: International conference on network science, Springer, pp 17–31

  • Valverde-Rebaza JC, de Andrade Lopes A (2012) Link prediction in complex networks based on cluster information. In: Brazilian symposium on artificial intelligence, Springer, pp 92–101

  • Yan B, Gregory S (2012a) Detecting community structure in networks using edge prediction methods. J Stat Mech Theory Exp 2012(09):P09008

  • Yan B, Gregory S (2012b) Finding missing edges in networks based on their community structure. Phys Rev E 85(5):056112

  • Yen TC, Larremore DB (2020) Community detection in bipartite networks with stochastic block models. Phys Rev E 102(3):032309

  • Zachary WW (1977) An information flow model for conflict and fission in small groups. J Anthropol Res 33(4):452–473

  • Zare H, Shooshtari P, Gupta A, Brinkman RR (2010) Data reduction for spectral clustering to analyze high throughput flow cytometry data. BMC Bioinform 11(1):403

  • Zhang X, Xia Z, Xu S, Wang J (2014) Ensemble method: community detection based on game theory. Int J Mod Phys B 28(30):1450211

  • Zhao X, Liang J, Wang J (2021) A community detection algorithm based on graph compression for large-scale social networks. Inf Sci 551:358–372

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001.

Author information

Corresponding author

Correspondence to Éric Tadeu Camacho de Oliveira.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

de Oliveira, É.T.C., de França, F.O. Enriching networks with edge insertion to improve community detection. Soc. Netw. Anal. Min. 11, 89 (2021). https://doi.org/10.1007/s13278-021-00803-6

  • DOI: https://doi.org/10.1007/s13278-021-00803-6
