Abstract
Community structure has been recognized over the past decade as an important statistical feature of networked systems. Much work has gone into discovering isolated (non-overlapping) communities in a network, with a focus on developing algorithms of high quality and good performance. Less work has addressed the discovery of overlapping community structure, even though it can better capture the nature of the network in some real-world applications. For example, people have varied characteristics and interests and can join very different communities in their social network. In this context, we present a novel algorithm for detecting overlapping community structures, which first finds seed sets by spectral partitioning and then extends them with a special random walk technique. At every expansion step, the modularity function Q is used to measure the expanded structures. This function has become one of the popular standards in community detection and is defined in Newman and Girvan (Phys. Rev. E 69:026113, 2004). We also give a theoretical analysis of the whole expansion process and prove that our algorithm obtains the best community structures greedily. Extensive experiments are conducted on real-world networks of various sizes. The results show that overlap is important for finding the complete community structures and that our method outperforms C-means in quality.
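As a rough illustration of the modularity function Q mentioned in the abstract, the following minimal sketch computes the Newman and Girvan form Q = sum over communities c of (e_cc - a_c^2) for an undirected, unweighted graph with a disjoint partition. The function name `modularity`, the edge-list input, and the toy graph are assumptions made for this example; the article's own expansion procedure and its handling of overlapping nodes are not reproduced here.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity of a disjoint partition.

    edges        -- list of (u, v) pairs, one per undirected edge
    community_of -- dict mapping each node to its community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # sum of node degrees in community c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # e_cc = internal[c] / m, a_c = degree_sum[c] / (2 m)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_communities = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_community = {node: "A" for node in range(6)}

q_split = modularity(edges, two_communities)
q_single = modularity(edges, one_community)
```

On this toy graph, splitting along the bridge gives Q = 5/14, while placing every node in one community gives Q = 0, which matches the intuition that Q rewards partitions with more internal edges than expected at random.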
References
Adamcsek, B., Palla, G., Farkas, I., Derényi, I., Vicsek, T.: CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics 22, 1021–1023 (2006)
Andersen, R., Lang, K.J.: Communities from seed sets. In: Proceedings of the 15th International World Wide Web Conference, Edinburgh, 23–26 May 2006
Baumes, J., Goldberg, M., Krishnamoorthy, M., Magdon-Ismail, M., Preston, N.: Finding communities by clustering a graph into overlapping subgraphs. In: Proc. IADIS Applied Computing, pp. 97–104, Algarve, 22–25 February 2005
Baumes, J., Goldberg, M., Krishnamoorthy, M., Magdon-Ismail, M.: Efficient identification of overlapping communities. In: Intelligence and Security Informatics (LNCS 3495), pp. 27–36. Springer, New York (2005)
Brandes, U., Delling, D., Gaertler, M., Goerke, R., Hoefer, M., Nikoloski, Z., Wagner, D.: Maximizing modularity is hard. arXiv:physics/0608255 (2006)
Burioni, R., Cassi, D.: Random walks on graphs: ideas, techniques and results. J. Phys. A, Math. Gen. 38(8), Article R01, March (2005)
Ding, C.H.Q., He, X., Zha, H., Gu, M., Simon, H.D.: A min-max cut algorithm for graph partitioning and data clustering. In: Proceedings of ICDM, pp. 107–114, San Jose, 29 November–2 December 2001
Duch, J., Arenas, A.: Community detection in complex networks using extremal optimization. Phys. Rev. E 72, 027104 (2005)
Gkantsidis, C., Mihail, M., Saberi, A.: Conductance and congestion in power law graphs. Sigmetrics 148–159 (2003)
Greco, G., Greco, S., Zumpano, E.: Web communities: models and algorithms. World Wide Web J. 7(1), 58–82 (2004)
Gregory, S.: An algorithm to find overlapping community structure in networks. In: Proceedings of the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Sep., pp. 91–102. Springer, New York (2007)
Hou, J., Zhang, Y.: Constructing good quality web page communities. In: Proc. of Thirteenth Australasian Database Conference (ADC2002), Melbourne, January–February 2002
Hou, J., Zhang, Y.: Utilizing hyperlink transitivity to improve web page clustering. In: Proceedings of the 14th Australasian Database Conference (ADC 2003), pp. 49–57, Adelaide, February 2003
Huang, J., Zhu, T., Schuurmans, D.: Web communities identification from random walks. In: Joint European Conference on Machine Learning and European Conference on Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD-06), Berlin, 18–22 September 2006
Kannan, R., Lovász, L., Montenegro, R.: Blocking conductance and mixing in random walks. Comb. Probab. Comput. 15, 541–570 (2006)
Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell Syst. Tech. J. 49, 291–307 (1970)
Lovász, L.: Random walks on graphs: a survey. In: Combinatorics, Paul Erdös is eighty, vol. 2 (Keszthely, 1993), pp. 353–397, Bolyai Soc. Math. Stud. 2, János Bolyai Math. Soc., Budapest (1996)
Montenegro, R., Tetali, P.: Mathematical aspects of mixing times in Markov chains. Found. Trends Theor. Comp. Sci. 1 (2006). doi:10.1561/0400000003
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004)
Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E 74, 036104 (2006)
Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. U. S. A. 103, 8577 (2006)
Ng, A., Jordan, M., Weiss, Y.: On spectral clustering: analysis and an algorithm. Adv. Neural Inf. Process. Syst. 14, 849–856 (2002)
Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005)
Pothen, A., Simon, H., Liou, K.-P.: Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl. 11, 430–452 (1990)
Sidiropoulos, A., Pallis, G., Katsaros, D., Stamos, K., Vakali, A., Manolopoulos, Y.: Prefetching in content distribution networks via web communities identification and outsourcing. World Wide Web J. 11(1), 39–70 (2008)
Scott, J.: Social Network Analysis: a Handbook, 2nd edn. Sage, London (2000)
Simon, H.D.: Partitioning of unstructured problems for parallel processing. Comput. Syst. Eng. 2(2–3), 135–148 (1991)
Spielman, D.A., Teng, S.-H.: Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In: ACM STOC-04, pp. 81–90. ACM, New York (2004)
Wei, F., Wang, C., Ma, L., Zhou, A.: Detecting Overlapping Community Structures in Networks with Global Partition and Local Expansion. APWeb, LNCS 4976 (2008)
White, S., Smyth, P.: A spectral clustering approach to finding communities in graphs. In: SIAM International Conference on Data Mining, Newport Beach, 21–23 April 2005
Zhang, S.H., Wang, R.S., Zhang, X.S.: Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A: Stat. Mech. Appl. 374(1), 483–490 (2007)
Zhang, Y., Yu, J.X., Hou, J.: Web Communities: Analysis and Construction. Springer, Berlin Heidelberg New York (2006)
About this article
Cite this article
Wei, F., Qian, W., Wang, C. et al. Detecting Overlapping Community Structures in Networks. World Wide Web 12, 235–261 (2009). https://doi.org/10.1007/s11280-009-0060-x
DOI: https://doi.org/10.1007/s11280-009-0060-x