Exploring the transition behavior of nodes in temporal networks based on dynamic community detection

https://doi.org/10.1016/j.future.2020.02.006

Highlights

  • Experiments on 15 real-world dynamic networks to explore the transition behavior of nodes.

  • Using the decision tree to find the node-level features that have a general impact on node transition.

  • Average neighbor degree and node’s degree are the most important features in this study.

  • We also find some interesting phenomena in real dynamic networks.

Abstract

Community detection and community evolution tracking are two important tasks in dynamic complex network analysis. Recently, a variety of models and methods have been proposed for detecting community structure and analyzing its evolution. However, these methods focus only on improving the performance of community detection or on identifying evolutionary events. They ignore the internal relevance between the structure of each snapshot of the dynamic network and the evolution pattern of communities, especially between the structural features of nodes and their dynamic transition behavior. To address this problem, we conduct experiments on 15 real-world dynamic networks to explore the transition behavior of nodes, which is one of the most influential evolutionary patterns in temporal community detection. Firstly, we obtain the temporal community structure using highly successful temporal community detection methods. Secondly, we extract features of nodes from the structure of the dynamic network and treat the community transition behavior of nodes as a binary classification problem. Finally, we use a decision tree to find the node-level features that have a general impact on node transition. Experiments indicate that the degree and average neighbor degree of nodes have the most common and indispensable impact on node transition behavior, a finding that should be very helpful for modeling dynamic complex networks in the future.

Introduction

Complex network analysis [1], [2] has received increasing attention from researchers in different fields, including computer science, social science, and physical science [3], [4], [5]. Complex networks consist of nodes and edges, which represent objects and the interactions between those objects, respectively. For example, in a social network, nodes could be social accounts and edges could represent the following or followed relationships between accounts. Because networks are among the most important and powerful data structures, analyzing and modeling them supports many tasks, such as social interaction pattern analysis, social recommendation and recognition of protein functional modules. Fundamental tasks in complex networks, such as node identification, link prediction and information dissemination, have been widely studied. In addition, community detection is one of the most significant tasks; it is usually defined as identifying tightly linked subgraphs in a complex network, and it can benefit other tasks.

In general, detecting community structures can help us recognize meaningful modules of a network. A variety of community detection methods have been developed, such as modularity-based methods [6], model-based methods [7], [8] and random walk-based methods [9], [10], [11]; comprehensive surveys can be found in [12], [13]. However, all these methods assume that the target network is static, that is, that the network structure is invariant. In reality, the network structure varies over time, which gives rise to dynamic networks. More specifically, in a dynamic network, nodes may be born or die over time, and links between two nodes may appear or disappear. To model a dynamic network, we usually represent it as a series of snapshots or slices, each of which can be regarded as a static network. Compared with static networks, detecting communities in dynamic networks poses new challenges [14]. The most important of these are how to fuse consecutive snapshot networks to improve the performance of community detection and how to describe the evolution of communities.
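
The snapshot view of a dynamic network can be illustrated with a short sketch. It assumes each snapshot is stored as a set of undirected edges; the helper names (`make_snapshot`, `nodes_of`, `snapshot_diff`) and the toy data are illustrative and do not come from the paper.

```python
def make_snapshot(pairs):
    """Build a snapshot from (u, v) pairs; frozenset makes edges undirected."""
    return {frozenset(pair) for pair in pairs}


def nodes_of(edges):
    """Return the set of nodes that appear in at least one edge."""
    return {node for edge in edges for node in edge}


def snapshot_diff(prev_edges, next_edges):
    """Compare two consecutive snapshots of a dynamic network.

    Isolated nodes are not represented in this edge-set encoding, so a node
    counts as "born" or "died" when it gains its first or loses its last edge.
    """
    prev_nodes, next_nodes = nodes_of(prev_edges), nodes_of(next_edges)
    return {
        "nodes_born": next_nodes - prev_nodes,
        "nodes_died": prev_nodes - next_nodes,
        "links_appeared": next_edges - prev_edges,
        "links_disappeared": prev_edges - next_edges,
    }


# Toy dynamic network with two snapshots (made-up data).
g1 = make_snapshot([("a", "b"), ("b", "c"), ("c", "d")])
g2 = make_snapshot([("a", "b"), ("b", "c"), ("c", "e")])
changes = snapshot_diff(g1, g2)
print(changes)
```

Each snapshot can then be handed to any static analysis routine, while the difference between consecutive snapshots captures the node and link changes described above.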

Take a co-author network as an example. Fig. 1 shows two snapshots of a dynamic network based on the DBLP data [15]. The nodes and edges are the authors and their cooperative relationships, and nodes with the same color belong to the same community. The three communities consist of authors from data mining, database and machine learning, respectively. From one snapshot to the next, a very important phenomenon is that the research field of some nodes changes; for example, an author from the database community joins the data mining community as time goes by and the network varies. This is a critical behavior in dynamic community detection, namely the transition behavior of nodes, which is the most widely considered dynamic pattern and is the focus of this paper.

In recent years, more and more attention has been paid to dynamic community detection, and different methods have been proposed, including two-step methods, evolutionary clustering methods and model-based methods. Two-step methods [16], [17] usually apply a static community detection algorithm to each snapshot and then perform a community matching step between adjacent time slices. This kind of method is not accurate enough because real-world data is often noisy. Moreover, such a two-step process usually results in unstable community structures and, consequently, unwarranted community evolution [15]. Evolutionary clustering was first devoted to clustering stream data and has since been developed for dynamic community detection: previous or historical network or community information is integrated into the community detection of subsequent network snapshots. Examples include evolutionary spectral clustering, dynamic non-negative matrix factorization and multi-objective evolutionary clustering [18], [19]. This type of method is still the most widely studied and used. Model-based methods [20], [21] usually define a series of network generation mechanisms to reconstruct the dynamic complex network and analyze the evolution of communities. One example is the dynamic stochastic block model (DSBM) [15], which describes the dynamic pattern based on the classic SBM and transforms community detection and evolution into parameter estimation. On the whole, model-based methods have very high computational complexity.
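
The matching step of a two-step method can be sketched as follows. This is a minimal illustration, not the procedure of [16] or [17]: it assumes communities have already been detected in each snapshot, and it matches them by Jaccard similarity with an assumed threshold of 0.3. The function names and toy data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two node sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_communities(prev_comms, next_comms, threshold=0.3):
    """Match each community at time t+1 to its most similar one at time t.

    prev_comms and next_comms map a community id to a set of nodes.
    Returns {next_id: prev_id or None}; None marks a community without a
    sufficiently similar predecessor (the threshold is an assumed value).
    """
    matches = {}
    for next_id, next_nodes in next_comms.items():
        best_id, best_sim = None, 0.0
        for prev_id, prev_nodes in prev_comms.items():
            sim = jaccard(prev_nodes, next_nodes)
            if sim > best_sim:
                best_id, best_sim = prev_id, sim
        matches[next_id] = best_id if best_sim >= threshold else None
    return matches


# Made-up communities from two adjacent snapshots.
prev_comms = {"A": {1, 2, 3, 4}, "B": {5, 6, 7}}
next_comms = {"X": {1, 2, 3, 5}, "Y": {6, 7, 8}, "Z": {9, 10}}
matches = match_communities(prev_comms, next_comms)
print(matches)
```

Because each snapshot is clustered independently, a small change in the edges can alter the detected communities and therefore the matches, which is one way to see why the two-step process can produce unstable community structures.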

As far as we know, existing methods for dynamic community detection focus on the performance of community detection and on evolutionary patterns or events, while ignoring the internal relevance between the structural variation of the dynamic network and the evolution pattern of communities. Therefore, we are interested in how the structural information of nodes affects community transitions. In other words, community evolution is usually driven by node transition, and our concern is the relationship between this transition behavior and the local variation of nodes. Although some model-based methods (e.g. [22], [23]) use the degree of nodes to improve the accuracy of community detection, these methods only make the distribution of node degrees within a community follow a power law and do not reveal the relationship between node degree and community evolution. This raises several questions. What kind of nodes are more likely to transfer between communities? Are there more statistical features related to the transfer behavior of nodes? Which feature is the most important? We believe that answering these questions could help us design more suitable models for community discovery in dynamic networks.

Our motivation is to explore which local structural features of a node have an important impact on the transition behavior of nodes in dynamic networks, and which of these features have a larger or smaller influence. In this paper, for a given dynamic network, we first obtain its community structure using three highly successful temporal community detection methods. Then, we extract ten features of nodes from the structure of the previous snapshot network and treat the community transition behavior of nodes as a binary classification problem. In detail, we use a decision tree as the classification model to find the node-level features that have a general impact on node transition, and we analyze the community evolution on all the snapshots of the dynamic network. Applying the framework to 15 real-world dynamic networks shows that the degree and average neighbor degree of nodes are the two most important features affecting node transition behavior. We believe that this is very helpful for modeling dynamic complex networks in the future. The specific contributions of this paper are as follows:

  • As far as we know, this paper is the first to explore which kinds of nodes are more likely to transfer between communities, which is the most important behavior in dynamic networks.

  • We extract features of the communities to which the nodes belong and features of the nodes themselves, treat a node’s community transition as a binary classification problem, and use these features to classify whether or not a node transfers.

  • We find that the important common features of a node’s community transition are its average neighbor degree and its degree. Moreover, the average neighbor degree is even more important than the degree, which is inconsistent with our previous understanding.
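
Framing node transition as a binary classification problem requires one label per node. The sketch below shows one way to derive such labels, assuming community ids are already aligned across snapshots (as a temporal community detection method would provide) and that only nodes present in both snapshots are labeled. The function name and the author names are hypothetical.

```python
def transition_labels(prev_membership, next_membership):
    """Label nodes present in both snapshots: 1 = changed community, 0 = stayed.

    Both arguments map a node to its community id. Community ids are assumed
    to be comparable across the two snapshots.
    """
    labels = {}
    for node, prev_comm in prev_membership.items():
        if node in next_membership:
            labels[node] = int(next_membership[node] != prev_comm)
    return labels


# Made-up memberships in the spirit of the co-author example.
membership_t = {
    "author1": "data mining",
    "author2": "database",
    "author3": "database",
    "author4": "machine learning",
}
membership_t1 = {
    "author1": "data mining",
    "author2": "data mining",  # moved from database to data mining
    "author3": "database",
    "author5": "machine learning",  # new node, no previous community
}
labels = transition_labels(membership_t, membership_t1)
print(labels)
```

These labels serve as the classification target, and the node features extracted from the previous snapshot serve as the inputs.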

Section snippets

Related work

Community detection is a fundamental task in complex network analysis, which can offer insight into the network formation mechanism and prediction [13], [24].

There have been a variety of methods proposed for community detection, including modularity optimization methods, spectral clustering methods and model-based methods. For example, Liu et al. [25] proposed a modularity optimization method using simulated annealing with a k-means iterative procedure to realize the model selection, which

Proposed framework

In this section, we introduce how to find the most critical structural features that affect the transition behavior of nodes across the snapshots.

The proposed framework is depicted in Fig. 2. First, we use several temporal community detection methods to detect node community membership and node transition behaviors. Then, we extract the structural features of nodes and use them as classification features. We assume that the node community transition is only related to node structural

Experiment

In this section, we first introduce the details of the 15 real-world data sets used throughout this paper. Then we present the binary classification results on the 15 data sets and the findings of our feature importance experiment, namely that node degree and node average neighbor degree are the two most important structural features for node community transition.
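
The idea of ranking features by importance can be illustrated with a simplified stand-in for a full decision tree. The sketch scores each feature by the largest decrease in Gini impurity that a single threshold split can achieve, which is the criterion a CART-style tree evaluates at its root. The function names are hypothetical, and the data is made up for illustration; it does not reproduce the experimental results.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_gain(values, labels):
    """Largest Gini impurity decrease from one threshold split on a feature."""
    parent, n, best = gini(labels), len(labels), 0.0
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        best = max(best, gain)
    return best


def rank_features(rows, labels):
    """rows: list of {feature: value} dicts. Returns (feature, gain), best first."""
    gains = {
        name: best_split_gain([row[name] for row in rows], labels)
        for name in list(rows[0])
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Made-up node features and transition labels (1 = node changed community).
rows = [
    {"degree": 1, "avg_neighbor_degree": 5.0},
    {"degree": 2, "avg_neighbor_degree": 4.5},
    {"degree": 5, "avg_neighbor_degree": 4.0},
    {"degree": 1, "avg_neighbor_degree": 2.0},
    {"degree": 4, "avg_neighbor_degree": 1.5},
    {"degree": 6, "avg_neighbor_degree": 1.0},
]
labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features(rows, labels)
print(ranking)
```

A full decision tree repeats this split search recursively and accumulates the impurity decrease of every split that uses a given feature, so its importance scores reflect more than the single best split shown here.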

Case study

In this section, we use part of the DBLP data [15] to show the impact of node degree and node average neighbor degree on node community transition. DBLP is a well-studied data set in many research areas, especially in complex network analysis. Our data is extracted from DBLP and contains the co-authorship information among the papers from 28 conferences over 10 years (1997–2007). These conferences cover three main research areas, including data mining, database and machine learning.
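
The two features examined here, node degree and average neighbor degree, can be computed from a single snapshot as sketched below. The sketch assumes an undirected, unweighted snapshot given as a list of edges, and takes the average neighbor degree of a node to be the mean degree of its neighbors. The function names and the toy graph are illustrative, not taken from the DBLP data.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_features(edges):
    """Return {node: {"degree": k, "avg_neighbor_degree": mean neighbor degree}}."""
    adj = build_adjacency(edges)
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    features = {}
    for node, nbrs in adj.items():
        # Every node in adj has at least one neighbor, so len(nbrs) >= 1.
        avg_nbr_deg = sum(degree[n] for n in nbrs) / len(nbrs)
        features[node] = {
            "degree": degree[node],
            "avg_neighbor_degree": avg_nbr_deg,
        }
    return features


# Toy snapshot: hub "h" linked to a, b and c, plus one edge between a and b.
edges = [("h", "a"), ("h", "b"), ("h", "c"), ("a", "b")]
features = degree_features(edges)
print(features)
```

In the toy graph, node `c` has degree 1 but an average neighbor degree of 3.0 because its only neighbor is the hub, which shows how the two features can differ for the same node.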

Conclusion

In this paper, we first treat a node’s community transition as a binary classification problem. Through the analysis of 15 real-world dynamic networks, we find that the degree and average neighbor degree of nodes are the two significant features that affect the pattern of node community transition. In fact, we observe that node average neighbor degree is more important than node degree, which is inconsistent with our previous understanding and also corrects our previous cognition

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61902278, 51438009) and the National Key R&D Program of China (2018YFC0831000).

Tianpeng Li received the B.S. degree from the School of Software Engineering, Tianjin University, in 2017. He is currently pursuing the M.S. degree with the College of Intelligence and Computing, Tianjin University. His research interests include machine learning, dynamic complex network analysis, dynamic community detection and community evolution in dynamic networks, and probabilistic graphical models and their applications in computer science.

References (47)

  • N. Dakiche et al., Tracking community evolution in social networks: A survey, Inf. Process. Manage. (2018)
  • M. Girvan et al., Community structure in social and biological networks, Proc. Natl. Acad. Sci. (2002)
  • D. Džamić et al., Ascent–descent variable neighborhood decomposition search for community detection by modularity maximization, Ann. Oper. Res. (2019)
  • B. Karrer et al., Stochastic blockmodels and community structure in networks, Phys. Rev. E (2011)
  • D. He, Z. Feng, D. Jin, X. Wang, W. Zhang, Joint identification of network communities and semantics via integrative...
  • E.M. Airoldi et al., Mixed membership stochastic blockmodels, J. Mach. Learn. Res. (2008)
  • M. Qiao, J. Yu, W. Bian, Q. Li, D. Tao, Improving stochastic block models by incorporating power-law degree...
  • T. Yang et al., Detecting communities and their evolutions in dynamic social networks: a Bayesian approach, Mach. Learn. (2011)
  • G. Palla et al., Quantifying social group evolution, Nature (2007)
  • Y. Sun et al., Matrix based community evolution events detection in online social networks
  • M.-S. Kim et al., A particle-and-density based evolutionary clustering method for dynamic networks, Proc. VLDB Endow. (2009)
  • D. Chakrabarti et al., Evolutionary clustering
  • X. Fan et al., Dynamic infinite mixed-membership stochastic blockmodel, IEEE Trans. Neural Netw. Learn. Syst. (2014)

    Wenjun Wang is currently a Professor at the College of Intelligence and Computing, Tianjin University, chief expert of major projects of the National Social Science Foundation, a specially invited big data expert of the Tianjin Public Security Bureau, and the director of the Tianjin Engineering Research Center of Big Data on Public Security. His research interests include computational social science, large-scale data mining, intelligence analysis and multi-layer complex network modeling. He has been the principal investigator of, or has been responsible for, more than 50 research projects, including the Major Project of National Social Science Fund, the Major Research Plan of the National Natural Science Foundation, the National Science–technology Support Plan Project of China, etc. He has published more than 50 papers in major international journals and conferences.

    Xunxun Wu received the B.S. degree in mathematics from Shandong University, Jinan, China, in 2016. She is currently pursuing the M.S. degree with the College of Intelligence and Computing, Tianjin University, Tianjin, China. Her current research interests include complex network analysis and data mining, and she is currently working on community detection, community evolution in dynamic networks, and probabilistic graphical models.

    Huaming Wu received the B.E. and M.S. degrees from Harbin Institute of Technology, China in 2009 and 2011, respectively, both in electrical engineering. He received the Ph.D. degree with the highest honor in computer science at Free University of Berlin, Germany in 2015. He is currently an associate professor in the Center for Applied Mathematics, Tianjin University. His research interests include mobile cloud computing, edge computing, fog computing, internet of things (IoTs), and deep learning.

    Pengfei Jiao received the Ph.D. degree in computer science from Tianjin University, Tianjin, China, in 2018. He is a lecturer with the Center of Biosafety Research and Strategy of Tianjin University. His current research interests include complex network analysis and data mining, and he is currently working on community detection and link prediction, community evolution in dynamic networks, network embedding and applications of statistical network models.

    Yandong Yu, an associate professor, works in the Department of Computer Science of Jining Normal University, Wulanchabu, Inner Mongolia. She is currently a visiting scholar in the College of Intelligence and Computing at Tianjin University. In 2003, she obtained the bachelor’s degree in computer science and technology from Tianjin Normal University. In 2011, she obtained the master’s degree in computer technology engineering from Inner Mongolia University. Her research interests include big data analysis, complex networks and network security.
