Characterizing the structure of large real networks to improve community detection

  • Original Article
  • Neural Computing and Applications

An Erratum to this article was published on 29 April 2017

Abstract

Community structure is a fundamental feature of many networks, and the problem of discovering communities in those networks has therefore attracted a great deal of research. However, due to the rapid increase of networks’ scale and the availability of real communities in many networks, the task of detecting communities in large real networks remains a challenging problem. In this paper, we study the structure of various large real networks and their ground-truth community structures and observe an interesting phenomenon: the difference of degrees (abbreviated as dod) of connected nodes follows a heavy-tailed distribution with an approximate power-law tail for large dod, in both the original network and the community structure, though to different extents. To explore the effect of this observation on identifying communities in large real networks, we propose a weighting strategy and embed it into two prominent community detection algorithms. Comparisons against the state of the art demonstrate very promising performance of the proposed weighting strategy.
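As a minimal sketch of the quantity the abstract describes, the snippet below computes the dod of every edge in a small undirected graph and its empirical complementary distribution, which is the curve one would inspect for a heavy tail. It assumes that dod means the absolute difference of the two endpoint degrees; the function names `degree_differences` and `ccdf` and the toy edge list are illustrative and do not come from the paper. The proposed weighting strategy is not specified in the abstract and is not reproduced here.

```python
from collections import Counter, defaultdict


def degree_differences(edges):
    """Return |deg(u) - deg(v)| for every edge of an undirected simple graph.

    Assumption: dod is taken as the absolute degree difference of the
    two endpoints of an edge.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [abs(degree[u] - degree[v]) for u, v in edges]


def ccdf(values):
    """Empirical complementary CDF: maps x to the fraction of values >= x."""
    counts = Counter(values)
    total = len(values)
    remaining = total
    result = {}
    for x in sorted(counts):
        result[x] = remaining / total
        remaining -= counts[x]
    return result


# Hypothetical toy graph: a star centred on node 0 plus one edge between two leaves.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2)]
dod = degree_differences(edges)
tail = ccdf(dod)
```

On a real network, plotting `tail` on log-log axes would be one way to check visually for an approximate power-law tail at large dod; a rigorous fit would need a dedicated statistical procedure.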

Notes

  1. http://snap.stanford.edu/ncp.

Acknowledgments

This work is supported by the Sichuan Provincial Project of International Scientific and Technical Exchange and Research Collaboration Programs. We also thank Srinivasan Parthasarathy for providing the code of L-Spar.

Author information

Corresponding author

Correspondence to Bo Yang.

Additional information

An erratum to this article is available at http://dx.doi.org/10.1007/s00521-017-3033-5.

About this article

Cite this article

Hu, Y., Yang, B. Characterizing the structure of large real networks to improve community detection. Neural Comput & Applic 28, 2321–2333 (2017). https://doi.org/10.1007/s00521-016-2264-1

  • DOI: https://doi.org/10.1007/s00521-016-2264-1
