A Probabilistic Framework for Structural Analysis and Community Detection in Directed Networks


Abstract:

There is growing interest in the structural analysis of directed networks. Two major points need to be addressed: 1) a formal and precise definition of the graph clustering and community detection problem in directed networks and 2) the design and evaluation of community detection algorithms for directed networks. Motivated by these, we develop a probabilistic framework for structural analysis and community detection in directed networks based on our previous work on undirected networks. By relaxing the assumption of symmetric bivariate distributions in our previous work to bivariate distributions that have the same marginal distributions, we can still formally define various notions for structural analysis in directed networks, including centrality, relative centrality, community, and modularity. We also extend three commonly used community detection algorithms for undirected networks to directed networks: the hierarchical agglomerative algorithm, the partitional algorithm, and the fast unfolding algorithm. These extensions are made possible by two modularity-preserving and sparsity-preserving transformations. In conjunction with the probabilistic framework, we show that these three algorithms converge in a finite number of steps. In particular, we show that the partitional algorithm is a linear-time algorithm for large sparse graphs. Moreover, the outputs of the hierarchical agglomerative algorithm and the fast unfolding algorithm are guaranteed to be communities. These three algorithms can also be extended to general bivariate distributions with some minor modifications. We also conduct various experiments using two sampling methods in directed networks: 1) PageRank and 2) random walks with self-loops and backward jumps.
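
To make the "same marginal distributions" condition concrete, the following is a minimal sketch, not the authors' implementation. It assumes that PageRank sampling yields the bivariate distribution p(u, v) = pi_u * P_uv, where P is the PageRank transition matrix and pi its stationary distribution, and that modularity is the sum over communities of p(S, S) - p_U(S) p_V(S). Both formulas, the function names, the damping value, and the toy graph are assumptions introduced here for illustration; the abstract does not state them.

```python
import numpy as np

def pagerank_transition(adj, damping=0.85):
    """Row-stochastic PageRank transition matrix (assumed form)."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    # Nodes without out-links jump uniformly at random.
    walk = np.where(out_deg[:, None] > 0,
                    adj / np.maximum(out_deg, 1.0)[:, None],
                    1.0 / n)
    return damping * walk + (1.0 - damping) / n

def stationary(P, tol=1e-12, max_iter=10_000):
    """Stationary distribution of P by power iteration."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(max_iter):
        nxt = pi @ P
        if np.abs(nxt - pi).sum() < tol:
            return nxt
        pi = nxt
    return pi

def sampled_bivariate(adj, damping=0.85):
    """Assumed sampled distribution p(u, v) = pi_u * P_uv."""
    P = pagerank_transition(adj, damping)
    return stationary(P)[:, None] * P

def modularity(p, labels):
    """Assumed modularity: sum over same-community pairs of p - p_U * p_V."""
    labels = np.asarray(labels)
    p_u, p_v = p.sum(axis=1), p.sum(axis=0)
    same = labels[:, None] == labels[None, :]
    return float(((p - np.outer(p_u, p_v)) * same).sum())

# Hypothetical toy graph: two directed 3-cycles linked by edges 2->3 and 5->0.
adj = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3), (5, 0)]:
    adj[u, v] = 1.0

p = sampled_bivariate(adj)
q_split = modularity(p, [0, 0, 0, 1, 1, 1])
q_single = modularity(p, [0, 0, 0, 0, 0, 0])
```

In this sketch p is not symmetric, yet its row and column marginals coincide because pi is stationary for P. That is the relaxed condition the abstract describes. Placing every node in one community gives zero modularity under the assumed formula, while splitting the two cycles gives a positive value.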
Published in: IEEE/ACM Transactions on Networking (Volume: 26, Issue: 1, February 2018)
Page(s): 31 - 46
Date of Publication: 30 October 2017

