Disjoint and Non-Disjoint Community Detection with Control of Overlaps Between Communities

Ben NCir, Chiheb-Eddine; Maiza, Ismail; Bouaguel, Waad; Essoussi, Nadia

doi:10.1007/s42979-020-00391-w

Original Research
Published: 04 January 2021

Volume 2, article number 15, (2021)

SN Computer Science

Chiheb-Eddine Ben NCir^1,2 (ORCID: orcid.org/0000-0003-4014-8264), Ismail Maiza^2, Waad Bouaguel^1,2 & Nadia Essoussi^2

Abstract

Overlapping community detection has become an important challenge in network analysis, motivating researchers to propose community detection methods that best fit the complex, non-disjoint structures found in real-world networks such as social, scientific and collaborative networks. Existing overlapping community detection methods usually build overlaps between communities that are larger than expected, and they do not let users interact with the system to regulate this size, except for methods that accept hard constraints. To solve these issues, we propose a novel non-disjoint community detection method, referred to as CDCO, which lets users interact with the system and regulate overlaps between communities based on existing relationships between nodes in the network. Just as analysts can control the number of communities or the minimal number of actors in a community, CDCO lets them regulate overlaps through an $\alpha$ parameter that can favor or penalize overlaps. The regulation of overlaps is introduced in the objective criterion and optimized iteratively during the community detection process. Extensive experiments, conducted on both simulated and real-world networks with different overlap sizes, show the importance of regulating overlaps when a non-disjoint partitioning of the network is needed, and show that CDCO outperforms existing conventional methods in terms of both F-measure and NMI.

Notes

Available at: http://snap.stanford.edu/data/com-Amazon.html.
Available at: http://snap.stanford.edu/data/com-DBLP.html.
Available at: http://snap.stanford.edu/data/com-Youtube.html.
Available at: http://snap.stanford.edu/data/com-LiveJournal.html.

Author information

Authors and Affiliations

College of Business, University of Jeddah, Jeddah, Saudi Arabia
Chiheb-Eddine Ben NCir & Waad Bouaguel
LARODEC Laboratory, University of Tunis, Tunis, Tunisia
Chiheb-Eddine Ben NCir, Ismail Maiza, Waad Bouaguel & Nadia Essoussi

Corresponding author

Correspondence to Chiheb-Eddine Ben NCir.

Ethics declarations

Conflict of interest

This article does not contain any studies with human participants or animals performed by any of the authors. On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Consider a network of 8 connected nodes, as described in Fig. 6, that we want to organize into 2 communities $C_1$ and $C_2$ using the proposed CDCO method. In this illustrative example, nodes $v_2$ and $v_4$ are initialized as the community attractors of $C_1$ and $C_2$, respectively. The following steps show how the method decides whether node $v_8$ is assigned to $C_1$, to $C_2$, or to both $C_1$ and $C_2$. We give a step-by-step execution for two cases: in the first, $\alpha =0$, which leads to a non-disjoint assignment of node $v_8$; in the second, $\alpha$ is increased to 2 to reduce the overlap between the two communities. Together, the two cases show how easily the size of overlaps can be adjusted.

A.1 First Case with $\alpha =0$

1. Evaluate the degree of connectivity between $v_8$ and each community using Eq. (3):
$$\begin{aligned}&{\text {Conn}}(v_8, C_1)= {\text {Conn}}(v_8,v_2)= \frac{1}{3+2-1}= \frac{1}{4} = 0.25 \\&{\text {Conn}}(v_8, C_2)= {\text {Conn}}(v_8,v_4)= \frac{2}{3+2-2}=\frac{2}{3} \approx 0.67 \end{aligned}$$
2. Evaluate the degree of connectivity between $v_8$ and the combination of the two communities using Eq. (4):
$$\begin{aligned} {\text {Conn}}(v_8, (C_1C_2) )= {\text {Conn}}(v_8,(v_2\cup v_4) )= \frac{3}{3+4-3}=\frac{3}{4} =0.75 \end{aligned}$$
These results show that the degree of connectivity of node $v_8$ is maximal when it is assigned to both $C_1$ and $C_2$. However, the final assignment is decided by evaluating the local error of node $v_8$ using Eq. (7) for each alternative and keeping the alternative with the minimal error.
3. Evaluate the local error of $v_8$ when assigned to the nearest community ($C_2$) using Eq. (7):
$$\begin{aligned} \left( 1+\frac{2}{3}\right) ^0 \left( 1-\frac{2}{3}\right) = \frac{1}{3} \approx 0.33 \end{aligned}$$
4. Evaluate the local error of $v_8$ when assigned to both the first and the second community ($C_2C_1$) using Eq. (7):
$$\begin{aligned} \left( 2+\frac{3}{4}\right) ^0 \left( 1-\frac{3}{4}\right) = 0.25 \end{aligned}$$

The minimal error is thus obtained when $v_8$ is assigned to both communities ($C_2C_1$). Therefore, to minimize the objective criterion (Eq. 5), $v_8$ must be assigned to both the first and the second community. These steps are repeated for each node $v_i$ in the network to build the partitioning matrix C. Fig. 6b reports the partitioning obtained on this illustrative example with $\alpha =0$.
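The arithmetic of this first case can be checked with a short Python sketch. It rests on two assumptions that are not taken from the article itself: the neighbour sets `n_v8`, `n_v2` and `n_v4` are hypothetical, chosen only so that their sizes and intersections match the counts appearing in the fractions above (Fig. 6 is not reproduced here), and the local error is written as $(k + \text{Conn})^{\alpha}(1 - \text{Conn})$, with $k$ the number of communities the node is assigned to, a form inferred from the worked numbers rather than copied from Eq. (7).

```python
from fractions import Fraction


def conn(neigh_a, neigh_b):
    """Jaccard-style connectivity: shared / (|A| + |B| - shared)."""
    shared = len(neigh_a & neigh_b)
    return Fraction(shared, len(neigh_a) + len(neigh_b) - shared)


def local_error(connectivity, n_communities, alpha):
    """Assumed form of the local error: (k + Conn)^alpha * (1 - Conn)."""
    return (n_communities + connectivity) ** alpha * (1 - connectivity)


# Hypothetical neighbour sets; only their sizes and overlaps matter here.
n_v8 = {"a", "b", "c"}
n_v2 = {"a", "d"}        # attractor of C1
n_v4 = {"b", "c"}        # attractor of C2

conn_c1 = conn(n_v8, n_v2)             # 1/4
conn_c2 = conn(n_v8, n_v4)             # 2/3
conn_both = conn(n_v8, n_v2 | n_v4)    # 3/4

alpha = 0
err_single = local_error(max(conn_c1, conn_c2), 1, alpha)  # 1/3
err_both = local_error(conn_both, 2, alpha)                # 1/4

# Keep the alternative with the smaller local error.
assignment = "C1 and C2" if err_both < err_single else "C2"
print(conn_c1, conn_c2, conn_both, err_single, err_both, assignment)
```

Exact fractions are used so that the comparison between the two errors is not affected by rounding.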

A.2 Second Case: Reduce Overlaps by Using $\alpha =2$

1. The first and second steps of the first case remain valid here. We now compute the local error with $\alpha =2$.
2. Evaluate the local error of $v_8$ when assigned to the nearest community ($C_2$) using Eq. (7):
$$\begin{aligned} \left( 1+\frac{2}{3}\right) ^2 \left( 1-\frac{2}{3}\right) = \frac{25}{27} \approx 0.93 \end{aligned}$$
3. Evaluate the local error of $v_8$ when assigned to both the first and the second community ($C_2C_1$) using Eq. (7):
$$\begin{aligned} \left( 2+\frac{3}{4}\right) ^2 \left( 1-\frac{3}{4}\right) \approx 1.89 \end{aligned}$$

The minimal error is now obtained when $v_8$ is assigned only to the second community ($C_2$). Therefore, to minimize the objective criterion (Eq. 5), $v_8$ must be assigned to $C_2$ alone. These steps are repeated for each node $v_i$ in the network to build the partitioning matrix C. Fig. 6c reports the partitioning obtained on this illustrative example with $\alpha =2$.
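To see how the decision for $v_8$ depends on $\alpha$, the two local errors can be evaluated for any value of the parameter. The sketch below reuses the connectivity values 2/3 and 3/4 from the example and again assumes the local error has the form $(k + \text{Conn})^{\alpha}(1 - \text{Conn})$, which is inferred from the worked numbers rather than taken from Eq. (7). The names `choose` and `alpha_star` are illustrative only. Under that assumption, the two errors are equal when $(2.75 / (5/3))^{\alpha} = (1/3) / (1/4)$, which gives the crossover value computed at the end.

```python
import math

CONN_C2 = 2 / 3      # connectivity of v8 to its nearest community, C2
CONN_BOTH = 3 / 4    # connectivity of v8 to the combination of C1 and C2


def local_error(connectivity, n_communities, alpha):
    """Assumed form of the local error: (k + Conn)^alpha * (1 - Conn)."""
    return (n_communities + connectivity) ** alpha * (1 - connectivity)


def choose(alpha):
    """Return the assignment of v8 with the smaller local error, and that error."""
    err_single = local_error(CONN_C2, 1, alpha)
    err_both = local_error(CONN_BOTH, 2, alpha)
    if err_single <= err_both:
        return "C2", err_single
    return "C1 and C2", err_both


# Value of alpha at which both alternatives have the same local error.
alpha_star = math.log((1 - CONN_C2) / (1 - CONN_BOTH)) / math.log(
    (2 + CONN_BOTH) / (1 + CONN_C2)
)

for alpha in (0, 0.5, 1, 2):
    print(alpha, choose(alpha))
print("crossover alpha:", round(alpha_star, 3))
```

With these values the crossover lies near $\alpha = 0.57$: below it the overlapping assignment has the smaller error, above it the single assignment to $C_2$ does. At $\alpha = 2$ the two errors are 25/27 (about 0.93) and 1.890625 (about 1.89).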

About this article

Cite this article

Ben NCir, CE., Maiza, I., Bouaguel, W. et al. Disjoint and Non-Disjoint Community Detection with Control of Overlaps Between Communities. SN COMPUT. SCI. 2, 15 (2021). https://doi.org/10.1007/s42979-020-00391-w

Received: 28 April 2020
Accepted: 03 November 2020
Published: 04 January 2021
DOI: https://doi.org/10.1007/s42979-020-00391-w
