Label propagation based evolutionary clustering for detecting overlapping and non-overlapping communities in dynamic networks
Introduction
In recent years, research on complex networks has attracted increasing attention owing to their great potential for capturing natural and social phenomena. Because complex networks usually change over time, they form dynamic networks. Discovering communities in dynamic networks has become a critical task.
Evolutionary clustering is an effective method for detecting communities in dynamic networks. Chakrabarti et al. [1] first addressed this issue and proposed an evolutionary clustering framework. The framework assumes that the cluster structure of a dynamic network changes little over a very short time, and therefore each community in the dynamic network should evolve smoothly over time. To achieve this smoothness, the framework trades off two distinct criteria at each timestamp: the snapshot quality and the history cost. The snapshot quality reflects how well the clustering result captures the current network, and the history cost measures how much the current clustering result deviates from the previous one. A good clustering therefore has a high snapshot quality and a low history cost. Inspired by this framework, several evolutionary clustering methods have been proposed [2], [3], [4].
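The trade-off between the two criteria can be illustrated with a minimal Python sketch. The two measures below are toy stand-ins chosen for illustration and are not the definitions used in [1]: the snapshot quality is taken to be the fraction of intra-community edges, the history cost is the fraction of node pairs grouped differently than at the previous timestamp, and the names `snapshot_quality`, `history_cost`, `evolutionary_score` and `change_weight` are hypothetical.

```python
from itertools import combinations


def snapshot_quality(edges, clustering):
    """Toy quality (assumption): fraction of edges whose endpoints share a community."""
    edges = list(edges)
    if not edges:
        return 0.0
    inside = sum(1 for u, v in edges if clustering[u] == clustering[v])
    return inside / len(edges)


def history_cost(clustering, previous):
    """Toy cost (assumption): fraction of node pairs grouped differently than before."""
    shared = list(set(clustering) & set(previous))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    changed = sum(
        1
        for u, v in pairs
        if (clustering[u] == clustering[v]) != (previous[u] == previous[v])
    )
    return changed / len(pairs)


def evolutionary_score(edges, clustering, previous, change_weight=0.5):
    """Higher is better: reward fit to the current snapshot, penalize drift."""
    return snapshot_quality(edges, clustering) - change_weight * history_cost(
        clustering, previous
    )


# Current snapshot: a path 1-2-3-4. The previous clustering had two communities.
edges = [(1, 2), (2, 3), (3, 4)]
previous = {1: 0, 2: 0, 3: 1, 4: 1}
keep_split = {1: 0, 2: 0, 3: 1, 4: 1}   # same grouping as before
merge_all = {1: 0, 2: 0, 3: 0, 4: 0}    # fits every edge but drifts from history

for weight in (0.0, 1.0):
    print(
        weight,
        evolutionary_score(edges, keep_split, previous, weight),
        evolutionary_score(edges, merge_all, previous, weight),
    )
```

With `change_weight` set to zero only the current snapshot matters, so the merged clustering scores higher; as the weight grows, the clustering that stays close to the previous result is preferred.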
However, all of these evolutionary clustering methods were designed to detect disjoint communities in dynamic networks. In real social networks, communities sometimes overlap. Some other methods [5], [6] can discover overlapping communities in dynamic networks, but both are incremental.
In this paper, we first improve the Dominant Label Propagation Algorithm DLPA [7] and propose a more stable algorithm, DLPA+. Then, we present DLPAE, an evolutionary clustering approach for dynamic networks based on DLPA+. In DLPAE, the community labels of a node are voted on by its neighbors, and a confidence is attached to each neighbor. During evolutionary clustering, confidences are computed between nodes and their neighbors. Each confidence consists of two parts: the confidence of the nodes in the current network and the confidence of the same nodes in the network at the previous timestamp. A user-defined parameter α controls the trade-off between the two. After that, we compute the variance of each node’s confidences and update nodes’ labels in descending order of this variance. In DLPAE, each node can possess one or more community labels with belonging coefficients no less than a threshold, and this property gives DLPAE the ability to detect both non-overlapping and overlapping communities in dynamic networks. By iteratively updating nodes’ labels, each node in the dynamic network keeps one or more community labels in the end. To discover non-overlapping communities, we choose the label with the greatest belonging coefficient as the community label of each node; to discover overlapping communities, we preserve all the labels.
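The steps described above can be sketched in Python under several stated assumptions. The text does not give the exact formulas, so this sketch assumes that α weights the current confidence and 1 − α the previous one, that a neighbor missing from a snapshot has confidence 0, and that the confidence variance is the plain population variance. The function names `blend_confidences`, `update_order` and `select_labels` are hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def blend_confidences(conf_now, conf_prev, alpha):
    """Combine current and previous confidences (assumed convex combination)."""
    blended = {}
    for node, now in conf_now.items():
        prev = conf_prev.get(node, {})
        # A neighbor absent from one snapshot is assumed to have confidence 0 there.
        neighbors = set(now) | set(prev)
        blended[node] = {
            nb: alpha * now.get(nb, 0.0) + (1.0 - alpha) * prev.get(nb, 0.0)
            for nb in neighbors
        }
    return blended


def update_order(blended):
    """Return nodes sorted by the variance of their confidences, largest first."""
    variance = {
        node: pvariance(list(conf.values())) if conf else 0.0
        for node, conf in blended.items()
    }
    return sorted(variance, key=lambda node: variance[node], reverse=True)


def select_labels(belonging, threshold, overlapping):
    """Keep every label at or above the threshold, or only the strongest label."""
    result = {}
    for node, coeffs in belonging.items():
        strongest = max(coeffs, key=coeffs.get)
        if overlapping:
            kept = {label for label, c in coeffs.items() if c >= threshold}
            # Fall back to the strongest label so no node is left unlabeled.
            result[node] = kept or {strongest}
        else:
            result[node] = {strongest}
    return result


# Illustrative values only; they are not taken from the paper.
conf_now = {"a": {"b": 0.9, "c": 0.1}, "b": {"a": 0.5, "c": 0.5}}
conf_prev = {"a": {"b": 0.5, "c": 0.5}, "b": {"a": 0.5, "d": 0.5}}
blended = blend_confidences(conf_now, conf_prev, alpha=0.5)
order = update_order(blended)

belonging = {"x": {1: 0.6, 2: 0.4}}
overlapping_labels = select_labels(belonging, threshold=0.3, overlapping=True)
disjoint_labels = select_labels(belonging, threshold=0.3, overlapping=False)
```

Sorting by variance replaces a random update order with a deterministic one, which is the source of the stability gain claimed for this ordering; the same belonging coefficients then yield either overlapping or disjoint communities depending only on how labels are selected at the end.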
We summarize the main contributions of this paper as follows: (1) We improve the dominant label propagation algorithm DLPA and make it more stable. (2) We propose a novel clustering approach DLPAE for dynamic networks. (3) DLPAE has the ability to detect overlapping and non-overlapping communities in dynamic networks.
The rest of the paper is organized as follows: in Section 2 we review the related work. Section 3 presents the preliminaries. We show the improved dominant label propagation algorithm DLPA+ in Section 4 and our evolutionary clustering approach DLPAE is described in Section 5. The experimental results and analysis are given in Section 6. Finally, Section 7 concludes the study.
Section snippets
Related work
Community discovery in complex networks has been a challenging research issue in recent years [8], [9], [10], [11]. Label propagation [12], [13], [14] has been shown to be a very efficient approach in this field owing to its simplicity and near-linear time complexity. However, all of these algorithms can only handle disjoint communities. Recently, two improved label propagation algorithms, COPRA [15] and SLPA [16], were proposed to reveal overlapping communities in complex networks. We also proposed a
Preliminaries
We have proposed a dominant label propagation algorithm DLPA [7] based on the traditional label propagation algorithm LPA [12]. It can be used for the discovery of overlapping and non-overlapping communities in networks. In the following, we introduce the related definitions which can be found in the literature [7].
The improved algorithm DLPA+
DLPA efficiently detects non-overlapping and overlapping communities in networks. By adjusting the value of the inflation parameter in, we can control the number of labels each node preserves and thereby the overlap rate. However, DLPA updates nodes’ labels in a random order, which raises a stability concern.
In order to resolve this problem, we improve DLPA by introducing the concept of Confidence Variance which is defined as follows:where
Algorithm description
In dynamic networks, the network structure changes over time as nodes and edges appear and disappear, which in turn changes nodes’ confidences. Taking the dynamic network in Fig. 1 as an example, the confidence of node 5 w.r.t. node 1 is 0.143 at timestamp t, and it changes to 0.119 at timestamp t+1. Likewise, the confidence of node 7 w.r.t. its neighbor node 9 is 0.288 at timestamp t, while it changes to 0 at timestamp t+1.
In summary, the changes of nodes’
Datasets
DLPA+
We evaluate the performance of DLPA+ using 8 synthetic networks and 4 real networks, which are listed in Table 4 and Table 5, respectively. The 8 synthetic networks are generated through the method described by Lancichinetti et al. [22]. In Table 4, n and k represent the number of nodes and the average degree of nodes, respectively. maxk is the maximum degree of nodes. minc and maxc denote the minimum and maximum community size, respectively. and are the number of overlapping nodes and the number of
Conclusion
The discovery of communities in dynamic networks is a critical research issue with wide applications. In this paper, we proposed a novel evolutionary clustering approach for dynamic networks. In this approach, the community labels of each node are determined by its neighbors. We update nodes’ labels in a specific order derived from the network structure, which makes our approach more stable and more accurate. By iteratively updating nodes’ labels, each node can keep one or more
Acknowledgments
We would like to thank the anonymous reviewers for their valuable comments. The work was supported in part by the National Science Foundation of China grants 61173093, 61202182 and 61474299, the Fundamental Research Funds for the Central Universities of China grants K5051323001 and BDY10, the Shaanxi Postdoctoral Science Foundation, and the Natural Science Basic Research Plan in Shaanxi Province of China grant 2014JQ8359. Any opinions, findings, and conclusions expressed here are those of the authors
References (31)
Community detection in graphs. Phys. Rep. (2010)
Advanced modularity-specialized label propagation algorithm for detecting communities in networks. Phys. A: Stat. Mech. Appl. (2010)
Evolutionary clustering
Evolutionary spectral clustering by incorporating temporal smoothness
Facetnet: a framework for analyzing communities and their evolutions in dynamic networks
A particle-and-density based evolutionary clustering method for dynamic networks. Proc. VLDB Endowment (2009)
Overlapping communities in dynamic networks: their detection and mobile applications
Detection of overlapping communities in dynamical social networks
Detecting overlapping communities in networks via dominant label propagation. Chinese Phys. B (2015)
Overlapping community detection in networks: the state-of-the-art and comparative study. ACM Comput. Surv. (CSUR) (2013)
Shrink: a structural clustering algorithm for detecting hierarchical communities in networks
Towards online multiresolution community detection in large-scale networks. PLoS ONE
Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E
Detecting network communities by propagating labels under constraints. Phys. Rev. E
Finding overlapping communities in networks by label propagation. New J. Phys.