Weighted-spectral clustering algorithm for detecting community structures in complex networks

Artificial Intelligence Review

Abstract

Community structure is a common, non-trivial topological feature of many complex real-world networks. Existing methods for identifying community structure are generally based on statistical properties such as degree centrality, shortest-path betweenness centrality, and modularity. However, the form of the community structure may vary widely even when the numbers of vertices and edges are fixed, so it is difficult to be certain of the exact number of clusters within a network. Clustering schemes that require the number of clusters to be specified in advance therefore often misjudge the community structure and cluster poorly as a result. Accordingly, the present study proposes a clustering algorithm, designated the Weighted-Spectral Clustering Algorithm, that detects the community structure of a network with no prior knowledge of the number of clusters. The proposed method is tested on both computer-generated networks and several real-world networks whose community structures are already known. The results confirm that the proposed algorithm partitions the network into an appropriate number of clusters in every case.
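To make the idea of choosing the number of clusters from the spectrum concrete, the following is a minimal sketch of the standard eigengap heuristic on a normalized graph Laplacian. It is not the Weighted-Spectral Clustering Algorithm, whose details the abstract does not give. The function names `chained_cliques` and `estimate_num_communities`, the `max_k` parameter, and the toy graph are all assumptions introduced only for illustration.

```python
import numpy as np


def chained_cliques(num_cliques, clique_size):
    """Toy graph (assumption): cliques joined in a chain by single bridge edges."""
    n = num_cliques * clique_size
    adj = np.zeros((n, n))
    for c in range(num_cliques):
        start = c * clique_size
        block = slice(start, start + clique_size)
        adj[block, block] = 1.0  # fully connect the nodes of this clique
        if c > 0:
            # One bridge edge from the previous clique's last node
            # to this clique's first node.
            adj[start - 1, start] = adj[start, start - 1] = 1.0
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj


def estimate_num_communities(adj, max_k=8):
    """Eigengap heuristic: pick k at the largest gap among the smallest eigenvalues.

    Assumes an undirected graph (symmetric adjacency) with no isolated nodes.
    """
    degrees = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degrees)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    laplacian = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    # gaps[i] is the jump between the (i+1)-th and (i+2)-th smallest eigenvalues.
    gaps = np.diff(eigvals[: max_k + 1])
    return int(np.argmax(gaps)) + 1


if __name__ == "__main__":
    adjacency = chained_cliques(num_cliques=3, clique_size=5)
    print(estimate_num_communities(adjacency))
```

On a graph with well-separated communities, the near-zero eigenvalues correspond to the communities, so the position of the largest gap suggests their number without it being supplied in advance. On noisier graphs the gap is less pronounced and the heuristic may be unreliable.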

Acknowledgments

The authors would like to thank the Ministry of Science and Technology, ROC, for the financial support of this study under Grant No. MOST 103-2221-E-006-147-MY3.

Author information

Corresponding author

Correspondence to Hui-Tang Lin.

About this article

Cite this article

Wang, TS., Lin, HT. & Wang, P. Weighted-spectral clustering algorithm for detecting community structures in complex networks. Artif Intell Rev 47, 463–483 (2017). https://doi.org/10.1007/s10462-016-9488-4

  • DOI: https://doi.org/10.1007/s10462-016-9488-4
