Abstract
Community structure is a fundamental feature of many networks, and the problem of discovering communities has therefore attracted a great deal of research. However, because networks are growing rapidly in scale and ground-truth communities are now available for many of them, detecting communities in large real networks remains challenging. In this paper, we study the structure of various large real networks together with their ground-truth community structures and observe an interesting phenomenon: the difference of degrees (abbreviated as dod) of connected nodes follows a heavy-tailed distribution with an approximate power-law tail for large dod, in both the original network and the community structure, but to different extents. To explore the effect of this observation on identifying communities in large real networks, we propose a weighting strategy and embed it into two prominent community detection algorithms. Comparisons against state-of-the-art methods demonstrate very promising performance of the proposed weighting strategy.
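As a rough illustration of the quantity discussed above, the following sketch computes the dod of every edge in a small undirected edge list and tallies how often each value occurs. It assumes that dod means the absolute difference |deg(u) - deg(v)| of the two endpoint degrees; the function names and the toy graph are hypothetical and not taken from the paper, and the proposed weighting strategy itself is not reproduced here.

```python
from collections import Counter


def degree_differences(edges):
    """Return the dod of each edge, assumed here to be |deg(u) - deg(v)|.

    `edges` is a list of (u, v) pairs describing an undirected simple graph.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [abs(degree[u] - degree[v]) for u, v in edges]


def dod_histogram(edges):
    """Count how many edges have each dod value (the empirical distribution)."""
    return Counter(degree_differences(edges))


# Toy graph (hypothetical): a star centred on node 0, plus one edge between two leaves.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
dods = degree_differences(edges)
hist = dod_histogram(edges)
print(dods)
print(sorted(hist.items()))
```

On a real network, one would plot this histogram on log-log axes to inspect the tail; checking whether the tail is actually consistent with a power law requires a proper statistical fit rather than visual inspection.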
Acknowledgments
This work is supported by the Sichuan Provincial Project of International Scientific and Technical Exchange and Research Collaboration Programs. We also thank Srinivasan Parthasarathy for providing the code of L-Spar.
Additional information
An erratum to this article is available at http://dx.doi.org/10.1007/s00521-017-3033-5.
Cite this article
Hu, Y., Yang, B. Characterizing the structure of large real networks to improve community detection. Neural Comput & Applic 28, 2321–2333 (2017). https://doi.org/10.1007/s00521-016-2264-1
DOI: https://doi.org/10.1007/s00521-016-2264-1