Abstract
Detecting community structure is a central theme of network analysis because it offers a coarse-grained view of the network at hand. A more challenging task is the detection of overlapping community structure, which has widespread applications in synthesising and interpreting data from social, biological and other diverse fields. Certain real-world networks possess a large number of nodes whose memberships are spread across multiple groups. This phenomenon, called community structure with pervasive overlaps, has been addressed only partially by a few well-known algorithms. In this paper, we present an algorithm called Interaction Coefficient-based Local Community Detection (IC-LCD) that not only uncovers community structures with pervasive overlaps but does so efficiently. The algorithm extracts communities through a local expansion strategy built on the notion of interaction coefficient. We evaluate the performance of IC-LCD in terms of speed, accuracy and stability on a number of synthetic and real-world networks, and compare the results with well-known baseline algorithms, namely DEMON, OSLOM, SLPA and COPRA. The results indicate that IC-LCD is competitive with the chosen baseline algorithms in uncovering community structures with pervasive overlaps. The time complexity of IC-LCD is \(\mathcal {O}(nc_{\max })\), where n is the number of nodes and \(c_{\max }\) is the maximum size of a community detected in the network.




References
Adamic, L.A., Glance, N.: The political blogosphere and the 2004 U.S. election: divided they blog. In: Proceedings of the 3rd International Workshop on Link Discovery, LinkKDD’05, pp. 36–43. ACM, New York (2005). https://doi.org/10.1145/1134271.1134277
Ahn, Y.Y., Bagrow, J.P., Lehmann, S.: Link communities reveal multiscale complexity in networks. Nature 466(7307), 761–764 (2010). https://doi.org/10.1038/nature09182
Bu, D., Zhao, Y., Cai, L., Xue, H., Zhu, X., Lu, H., Zhang, J., Sun, S., Ling, L., Zhang, N., Li, G., Chen, R.: Topological structure analysis of the protein-protein interaction network in budding yeast. Nucleic Acids Res. 31(9), 2443–2450 (2003)
Coscia, M., Rossetti, G., Giannotti, F., Pedreschi, D.: DEMON: a local-first discovery method for overlapping communities. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’12, pp. 615–623. Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2339530.2339630
Costa, G., Ortale, R.: Topic-aware joint analysis of overlapping communities and roles in social media. Int. J. Data Sci. Anal. 9(4), 415–429 (2020)
Ding, Z., Zhang, X., Sun, D., Luo, B.: Overlapping community detection based on network decomposition. Sci. Rep. 6, 24115 (2016). https://doi.org/10.1038/srep24115
Dunn, J.C.: A fuzzy relative of the isodata process and its use in detecting compact well-separated clusters (1973)
Fan, X., Cao, L., Da Xu, R.Y.: Dynamic infinite mixed-membership stochastic blockmodel. IEEE Trans. Neural Netw. Learn. Syst. 26(9), 2072–2085 (2015). https://doi.org/10.1109/TNNLS.2014.2369374
Flake, G.W., Lawrence, S., Giles, C.L.: Efficient identification of web communities. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’00, pp. 150–160. ACM, New York (2000). https://doi.org/10.1145/347090.347121
Fortunato, S.: Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010). https://doi.org/10.1016/j.physrep.2009.11.002
Fortunato, S., Barthélemy, M.: Resolution limit in community detection. PNAS 104(1), 36–41 (2007). https://doi.org/10.1073/pnas.0605965104
Fortunato, S., Hric, D.: Community detection in networks: a user guide. Phys. Rep. 659, 1–44 (2016). https://doi.org/10.1016/j.physrep.2016.09.002
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. PNAS 99(12), 7821–7826 (2002). https://doi.org/10.1073/pnas.122653799
Gleiser, P.M., Danon, L.: Community structure in jazz. Advs. Complex Syst. 06(04), 565–573 (2003). https://doi.org/10.1142/S0219525903001067
Gregory, S.: An algorithm to find overlapping community structure in networks. In: European Conference on Principles of Data Mining and Knowledge Discovery, pp. 91–102. Springer, Berlin (2007)
Gregory, S.: A fast algorithm to find overlapping communities in networks. In: Daelemans, W., Goethals, B., Morik, K. (eds.) Machine Learning and Knowledge Discovery in Databases. Lecture Notes in Computer Science, pp. 408–423. Springer, Berlin (2008)
Gregory, S.: Finding overlapping communities in networks by label propagation. New J. Phys. 12(10), 103018 (2010). https://doi.org/10.1088/1367-2630/12/10/103018
Guimerá, R., Amaral, L.A.N.: Cartography of complex networks: modules and universal roles. J. Stat. Mech. 2005(P02001), P02001-1–P02001-13 (2005). https://doi.org/10.1088/1742-5468/2005/02/P02001
Havemann, F., Heinz, M., Struck, A., Gläser, J.: Identification of overlapping communities and their hierarchy by locally calculating community-changing resolution levels. J. Stat. Mech. 2011(01), P01023 (2011). https://doi.org/10.1088/1742-5468/2011/01/P01023
He, D., Jin, D., Chen, Z., Zhang, W.: Identification of hybrid node and link communities in complex networks. Sci. Rep. 5, 8638 (2015). https://doi.org/10.1038/srep08638
Knuth, D.E.: The Stanford GraphBase: A Platform for Combinatorial Computing. Addison-Wesley, Reading (1993)
Kumar, P., Dohare, R.: A neighborhood proximity based algorithm for overlapping community structure detection in weighted networks. Front. Comput. Sci. (2019). https://doi.org/10.1007/s11704-019-8098-0
Kumar, P., Dohare, R.: Formalising and detecting community structures in real world complex networks. J. Syst. Sci. Complex. 34, 180–205 (2021). https://doi.org/10.1007/s11424-020-9252-3
Lancichinetti, A., Fortunato, S.: Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys. Rev. E 80(1), 016118 (2009). https://doi.org/10.1103/PhysRevE.80.016118
Lancichinetti, A., Fortunato, S., Kertész, J.: Detecting the overlapping and hierarchical community structure in complex networks. New J. Phys. 11(3), 033015 (2009). https://doi.org/10.1088/1367-2630/11/3/033015
Lancichinetti, A., Radicchi, F., Ramasco, J.J., Fortunato, S.: Finding statistically significant communities in networks. PLoS ONE 6(4) (2011). https://doi.org/10.1371/journal.pone.0018961
Lázár, A., Abel, D., Vicsek, T.: Modularity measure of networks with overlapping communities. EPL (Europhys. Lett.) 90(1), 18001 (2010). https://doi.org/10.1209/0295-5075/90/18001
Lee, C., Reid, F., McDaid, A., Hurley, N.: Detecting highly overlapping community structure by greedy clique expansion. arXiv:1002.1827 [physics] (2010)
Leskovec, J., Krevl, A.: SNAP Datasets: stanford large network dataset collection. http://snap.stanford.edu/data (2014)
Lu, Z., Sun, X., Wen, Y., Cao, G., Porta, T.L.: Algorithms and applications for community detection in weighted networks. IEEE Trans. Parallel Distrib. Syst. 26(11), 2916–2926 (2015). https://doi.org/10.1109/TPDS.2014.2370031
McDaid, A., Hurley, N.: Detecting highly overlapping communities with model-based overlapping seed expansion. In: 2010 International Conference on Advances in Social Networks Analysis and Mining, pp. 112–119 (2010). https://doi.org/10.1109/ASONAM.2010.77
Newman, M.E.J.: Network datasets from Newman. http://www-personal.umich.edu/~mejn/netdata/
Newman, M.E.J.: The structure of scientific collaboration networks. PNAS 98(2), 404–409 (2001). https://doi.org/10.1073/pnas.98.2.404
Newman, M.E.J.: Detecting community structure in networks. Eur. Phys. J. B 38(2), 321–330 (2004). https://doi.org/10.1140/epjb/e2004-00124-y
Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E 69(6), 066133 (2004). https://doi.org/10.1103/PhysRevE.69.066133
Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74(3), 036104 (2006). https://doi.org/10.1103/PhysRevE.74.036104
Newman, M.E.J.: Modularity and community structure in networks. PNAS 103(23), 8577–8582 (2006). https://doi.org/10.1073/pnas.0601602103
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004). https://doi.org/10.1103/PhysRevE.69.026113
Nicosia, V., Mangioni, G., Carchiolo, V., Malgeri, M.: Extending the definition of modularity to directed graphs with overlapping communities. J. Stat. Mech. 2009(03), P03024 (2009). https://doi.org/10.1088/1742-5468/2009/03/P03024
Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005). https://doi.org/10.1038/nature03607
Qi, Y., Ge, H.: Modularity and dynamics of cellular networks. PLoS Comput. Biol. 2(12), e174 (2006). https://doi.org/10.1371/journal.pcbi.0020174
Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76(3), 036106 (2007). https://doi.org/10.1103/PhysRevE.76.036106
Reichardt, J., Bornholdt, S.: Detecting fuzzy community structures in complex networks with a Potts model. Phys. Rev. Lett. 93(21), 218701 (2004)
Rossetti, G., Milli, L., Cazabet, R.: CDLIB: a python library to extract, compare and evaluate communities from complex networks. Appl. Netw. Sci. 4(1), 52 (2019). https://doi.org/10.1007/s41109-019-0165-9
Shen, H., Cheng, X., Cai, K., Hu, M.B.: Detect overlapping and hierarchical community structure in networks. Physica A 388(8), 1706–1712 (2009). https://doi.org/10.1016/j.physa.2008.12.021
Sun, H., Jia, X., Huang, R., Wang, P., Wang, C., Huang, J.: Distance dynamics based overlapping semantic community detection for node-attributed networks. Comput. Intell. (2020)
Sun, H., Liu, J., Huang, J., Wang, G., Jia, X., Song, Q.: LinkLPA: a link-based label propagation algorithm for overlapping community detection in networks. Comput. Intell. 33(2), 308–331 (2017). https://doi.org/10.1111/coin.12087
Tripathi, B., Parthasarathy, S., Sinha, H., Raman, K., Ravindran, B.: Adapting community detection algorithms for disease module identification in heterogeneous biological networks. Front. Genet. 10, 164 (2019)
Wang, Y., Bu, Z., Yang, H., Li, H.J., Cao, J.: An effective and scalable overlapping community detection approach: integrating social identity model and game theory. Appl. Math. Comput. 390, 125601 (2021). https://doi.org/10.1016/j.amc.2020.125601
Watts, D.J., Strogatz, S.H.: Collective dynamics of small-world networks. Nature 393(6684), 440–442 (1998). https://doi.org/10.1038/30918
Wei, Y., Singh, L., Gallagher, B., Buttler, D.: Overlapping target event and story line detection of online newspaper articles. In: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp. 222–232 (2016). https://doi.org/10.1109/DSAA.2016.30
White, S., Smyth, P.: A spectral clustering approach to finding communities in graphs. In: Proceedings of the 2005 SIAM International Conference on Data Mining, Proceedings, pp. 274–285. Society for Industrial and Applied Mathematics (2005)
Xie, J., Kelley, S., Szymanski, B.K.: Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput. Surv. 45(4), 43:1–43:35 (2013). https://doi.org/10.1145/2501654.2501657
Xie, J., Szymanski, B.K., Liu, X.: SLPA: uncovering overlapping communities in social networks via a speaker–listener interaction dynamic process. pp. 344–349. IEEE (2011). https://doi.org/10.1109/ICDMW.2011.154
Yang, J., Leskovec, J.: Overlapping community detection at scale: a nonnegative matrix factorization approach. In: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, WSDM’13, pp. 587–596. Association for Computing Machinery, New York (2013). https://doi.org/10.1145/2433396.2433471
Yang, J., Leskovec, J.: Overlapping communities explain core–periphery organization of networks. Proc. IEEE 102(12), 1892–1902 (2014). https://doi.org/10.1109/JPROC.2014.2364018
Yang, J., McAuley, J.J., Leskovec, J.: Community Detection in Networks with Node Attributes (2013)
Zhang, F., Ma, A., Wang, Z., Ma, Q., Liu, B., Huang, L., Wang, Y.: A central edge selection based overlapping community detection algorithm for the detection of overlapping structures in protein–protein interaction networks. Molecules 23(10), 2633 (2018)
Zhang, S., Wang, R.S., Zhang, X.S.: Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A 374(1), 483–490 (2007). https://doi.org/10.1016/j.physa.2006.07.023
Zhang, Y., Yin, D., Wu, B., Long, F., Cui, Y., Bian, X.: Plinkshrink: a parallel overlapping community detection algorithm with link-graph for large networks. Soc. Netw. Anal. Min. 9(1), 66 (2019)
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
Here we shall illustrate the concepts of node and edge interaction coefficients, with the help of examples. Then we shall see how the seed expansion takes place using these coefficients.
1.1 Appendix A.1: Illustration of node and edge interaction coefficients
Consider the graph given below (Fig. 5).
Take \(C_1 = \lbrace u_0 \rbrace \), and \(C_2 = \lbrace v_0 \rbrace \). Let us expand \(C_1\), and \(C_2\) using the node interaction coefficient \(\xi _{\text {node}}\), taking \(\xi _0 = 0.5\). Note that \(N_{C_1} = \lbrace u_1, u_2, u_3, u_4 \rbrace \) and \(N_{C_2} = \lbrace v_1, v_2, v_3, v_4 \rbrace \). For each \(1 \le i \le 4\), we have
This means \(C_1\) would not expand. However, for each \(1 \le i \le 4\) we have
which means \(C_2\) can expand to all its neighbours, and becomes \(C_2 = \lbrace v_0, v_1, v_2, v_3, v_4 \rbrace \). So, \(N_{C_2} = \lbrace v_5, v_6, \ldots , v_{12} \rbrace \). Now for each \(5 \le i \le 12\), we have
which means \(C_2\) cannot be expanded further. The case we have considered is specific, but it captures the two important types of seeds: highly clustered and weakly clustered. A single strategy, such as the one based only on the node interaction coefficient, will not work for the expansion of both kinds of seeds.
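The contrast between the two kinds of seeds can be reproduced with a minimal sketch. It rests on several assumptions that are not taken from Fig. 5: the node interaction coefficient is taken to be the share of a node's neighbours that lie in the community (the reading suggested by the worked value 3 in, 6 out giving 1/3 later in this appendix), a node is assumed to join when its coefficient is at least \(\xi_0\), and both toy graphs as well as the helper names `build`, `xi_node` and `expand` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def build(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def xi_node(adj, u, community):
    """Assumed definition: share of u's neighbours that lie in `community`."""
    return Fraction(len(adj[u] & community), len(adj[u]))


def expand(adj, seed, xi0=Fraction(1, 2)):
    """Absorb, round by round, every boundary node with xi_node >= xi0."""
    community = set(seed)
    while True:
        boundary = set().union(*(adj[w] for w in community)) - community
        joiners = {u for u in boundary if xi_node(adj, u, community) >= xi0}
        if not joiners:
            return community
        community |= joiners


# Hypothetical highly clustered seed: u0 sits in a 5-clique.
clique = build(combinations(["u0", "u1", "u2", "u3", "u4"], 2))

# Hypothetical weakly clustered seed: v0 has four neighbours v1..v4 that are
# not linked to each other; each v_i leads to a node w_i with two extra leaves.
sparse_edges = []
for i in range(1, 5):
    sparse_edges += [("v0", f"v{i}"), (f"v{i}", f"w{i}"),
                     (f"w{i}", f"x{i}"), (f"w{i}", f"y{i}")]
sparse = build(sparse_edges)

clique_result = expand(clique, {"u0"})  # every u_i scores 1/4, so nothing joins
sparse_result = expand(sparse, {"v0"})  # every v_i scores 1/2, every w_i then 1/3
```

On these toy graphs the clique seed stays at `{u0}`, while the sparse seed grows to `{v0, ..., v4}` and then stops, which mirrors the behaviour of the two kinds of seeds described above.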
Therefore, we have introduced the concept of the edge interaction coefficient \(\xi _{{\text {edge}}}\). An edge \(e_{uv}\) interacts with a subgraph C through its endpoints u and v. To arrive at a formula for \(\xi _\text {edge}(e_{uv}, C)\), we use the following assumption: the more neighbours both u and v have in C, the more strongly \(e_{uv}\) interacts with C. This suggests starting from a quantity that grows with the numbers of neighbours of u and v in C.
To normalise it, we can divide it by the minimum or the maximum of the degrees of u and v. Moreover, we want \(\xi _\text {edge}(e_{uv}, C)\) to be highest when \(N_u \subseteq V_C \backslash \lbrace v \rbrace \), \(N_v \subseteq V_C \backslash \lbrace u \rbrace \), and \(d_u = d_v\). Taking all these requirements into account, we get Eq. (2). It is apparent that \(0 \le \xi _{\text {edge}} \le 1\). It can be seen that \(\xi _\text {edge}(e_{uv}, C) = 1\) iff \(d_u = d_v\) and \(\left| N_u \cap V_C\right| = \left| N_v \cap V_C \right| \). In the denominator of Eq. (2) we have taken \(\max \lbrace d_u, d_v \rbrace \) instead of \(\min \lbrace d_u, d_v \rbrace \). To see why, let us look at the case given in the picture below.

The node u has 3 neighbours in C and 6 neighbours outside C. So, \(\xi _\text {node}(u, C) = 1/3\), which is smaller than the threshold \(\xi _0\). Consequently, u must not join C in any case. However, v has 5 neighbours in C and just 2 neighbours outside C, so v would surely join C. Now let us compute the node interaction coefficient of u with \(C \cup \lbrace v \rbrace \). Since v is a neighbour of u, we have \(\xi _\text {node}(u, C \cup \lbrace v \rbrace ) = 4/9 < \xi _0\).
Thus u would not join \(C \cup \lbrace v \rbrace \) either. Now consider the case when \(\min \lbrace d_u, d_v \rbrace \) is the denominator in Eq. (2). Then the edge interaction coefficient of \(e_{uv}\) with C becomes large enough for u to join C, even though u should stay outside. Thus \(\min \lbrace d_u, d_v \rbrace \) is not an appropriate choice for the denominator in Eq. (2).
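The node-level numbers in this example can be checked with a short sketch. It assumes that the node interaction coefficient is the share of a node's neighbours lying in the subgraph (consistent with the value 1/3 quoted above) and that u and v are adjacent, since the example concerns the edge \(e_{uv}\). The node labels and the choice of which members of C each endpoint touches are hypothetical, and Eq. (2) itself is not reproduced here.

```python
from fractions import Fraction


def xi_node(adj, u, nodes):
    """Assumed definition: share of u's neighbours that lie in `nodes`."""
    return Fraction(len(adj[u] & nodes), len(adj[u]))


# Hypothetical labels for the members of C.
C = {"c1", "c2", "c3", "c4", "c5"}

# Only the neighbourhoods of u and v are needed for the check.
adj = {
    # u: 3 neighbours in C, 6 outside C (v is one of the six)
    "u": {"c1", "c2", "c3", "v", "a1", "a2", "a3", "a4", "a5"},
    # v: 5 neighbours in C, 2 outside C (u is one of the two)
    "v": {"c1", "c2", "c3", "c4", "c5", "u", "b1"},
}

xi0 = Fraction(1, 2)
xi_u = xi_node(adj, "u", C)                 # coefficient of u with C
xi_v = xi_node(adj, "v", C)                 # coefficient of v with C
xi_u_after = xi_node(adj, "u", C | {"v"})   # coefficient of u once v has joined
```

Under these assumptions u stays below the threshold both before and after v joins, whereas v clears it comfortably.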
1.2 Appendix A.2: Illustration of seed expansion phase
We illustrate the GET-NEW-NODES() procedure through examples. Note that we do not specify any criterion for selecting seeds, so any node may serve as a seed. It may therefore happen that certain seeds, especially low-degree nodes, stop expanding after growing to a few nodes, or do not expand at all. Let us consider a few examples assuming that \(\xi _0 = 0.5\) and \(n_{\min } = 4\).
Example 1
Consider the graph given in Fig. 6.
Let \(C = \lbrace 30, 31 \rbrace \). Then \(N_C = \lbrace 25, 26, 29 \rbrace \). In order to compute \(V_{\text {new}}\), the steps followed before the augmentation step are listed in Table 6. No node of \(N_C\) is added to \(V_{\text {new}}\), leaving \(V_{\text {new}}\) empty. On the other hand, during the augmentation step, we find that \(\xi _\text {node}(25, C \cup N_C) = 1/2, \xi _\text {node}(26, C \cup N_C) = 3/4\) and \(\xi _\text {node}(29, C \cup N_C) = 3/5\), which makes \(V_{\text {new}} = \lbrace 25, 26, 29 \rbrace \).
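The augmentation step of Example 1 can be sketched as follows. This is an illustration under stated assumptions, not the GET-NEW-NODES() procedure itself: the rule "add every \(u \in N_C\) with \(\xi_\text{node}(u, C \cup N_C) \ge \xi_0\)" is inferred from the example, the coefficient is taken to be the share of a node's neighbours inside the given node set, and the edge list is a hypothetical graph chosen only so that it reproduces the three quoted values. It is not the graph of Fig. 6, and the steps listed in Table 6 are not modelled.

```python
from fractions import Fraction

# Hypothetical edge list (NOT Fig. 6), built to match the quoted coefficients.
EDGES = [
    (30, 31), (30, 26), (30, 25), (31, 29),
    (26, 29), (26, 25), (26, 27),
    (29, 25), (29, 27), (29, 28),
    (25, 22), (25, 23), (25, 24),
]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def xi_node(adj, u, nodes):
    """Assumed definition: share of u's neighbours that lie in `nodes`."""
    return Fraction(len(adj[u] & nodes), len(adj[u]))


def augment(adj, community, xi0):
    """Return the boundary nodes whose coefficient with C ∪ N_C reaches xi0."""
    boundary = set().union(*(adj[w] for w in community)) - community  # N_C
    enlarged = community | boundary                                   # C ∪ N_C
    return {u for u in boundary if xi_node(adj, u, enlarged) >= xi0}


adj = build_adjacency(EDGES)
C = {30, 31}
N_C = set().union(*(adj[w] for w in C)) - C
v_new = augment(adj, C, Fraction(1, 2))
```

With this graph, `N_C` is `{25, 26, 29}` and the three coefficients with respect to \(C \cup N_C\) are 1/2, 3/4 and 3/5, so all three boundary nodes end up in `v_new`.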
Example 2
This time consider the subgraph \(C = \lbrace 1,2,3 \rbrace \) in the graph given in Fig. 6. Here \(N_C = \lbrace 4,5,14,22,23 \rbrace \). Then before the augmentation step \(V_{\text {new}}\) remains empty as shown in Table 7. However, in this case even the augmentation step does not help, as \(\xi _\text {node}(u, C \cup N_C) < \xi _0\) for all \(u \in N_C\), and so, \(V_{\text {new}} = \varnothing \). Thus the subgraph C is not expandable to a full community. Such groups of nodes are likely to join multiple communities and form the basis for pervasive overlaps.
Cite this article
Kumar, P., Dohare, R. An interaction-based method for detecting overlapping community structure in real-world networks. Int J Data Sci Anal 14, 27–44 (2022). https://doi.org/10.1007/s41060-022-00314-3
DOI: https://doi.org/10.1007/s41060-022-00314-3