Community detection in social networks using user frequent pattern mining

  • Regular Paper
Knowledge and Information Systems

Abstract

Social networking sites now offer a rich resource of heterogeneous data. The analysis of such data can lead to the discovery of unknown information and relations in these networks. The detection of communities of ‘similar’ nodes is a challenging topic in the analysis of social network data, and it has been widely studied in the social networking community in the context of the underlying graph structure. Online social networks, in addition to having graph structures, contain useful information about their users. Using this information can enhance the quality of community discovery. In this study, a community discovery method is proposed that uses content information, in addition to the connections among nodes, to improve the quality of the discovered communities. This is a new approach based on frequent patterns and the actions of users on networks, particularly social networking sites where users carry out their preferred activities. The main contributions of the proposed method are twofold: first, based on the interests and activities of users on networks, small communities of similar users are discovered; second, the discovered communities are extended by using social relations. The F-measure is used to evaluate the results on two real-world datasets (Blogcatalog and Flickr), demonstrating that the proposed method leads to improved community detection quality.
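
The two steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the naive pattern enumeration, the function names `seed_communities` and `extend_community`, the thresholds `min_support` and `min_links`, and the toy data are all assumptions made for illustration. Users who share a frequent set of actions form a small seed community, and the seed is then extended along friendship links.

```python
from collections import defaultdict
from itertools import combinations


def seed_communities(user_actions, min_support=2, pattern_size=2):
    """Group users who share a frequent action pattern (naive enumeration).

    Hypothetical helper: a pattern is a set of `pattern_size` actions, and it
    is frequent when at least `min_support` users perform all of them.
    """
    supporters = defaultdict(set)
    for user, actions in user_actions.items():
        for pattern in combinations(sorted(actions), pattern_size):
            supporters[pattern].add(user)
    return {p: users for p, users in supporters.items() if len(users) >= min_support}


def extend_community(members, friends, min_links=2):
    """Add outside users who are linked to at least `min_links` members.

    Hypothetical extension rule; repeats until no further user qualifies.
    """
    community = set(members)
    changed = True
    while changed:
        changed = False
        for user, neighbours in friends.items():
            if user not in community and len(neighbours & community) >= min_links:
                community.add(user)
                changed = True
    return community


# Toy data (invented): actions per user and a symmetric friendship relation.
user_actions = {
    "ann": {"photo", "travel", "food"},
    "bob": {"photo", "travel"},
    "cat": {"photo", "travel", "music"},
    "dan": {"music"},
    "eve": {"food"},
}
friends = {
    "ann": {"bob", "dan", "eve"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"bob"},
    "dan": {"ann", "bob"},
    "eve": {"ann"},
}

seeds = seed_communities(user_actions)
extended = {p: extend_community(users, friends) for p, users in seeds.items()}
print(extended)
```

In this toy run the only frequent pattern is (photo, travel), shared by ann, bob, and cat. The extension step adds dan, who has two friends in the seed, and leaves out eve, who has only one. A real implementation would more likely use a dedicated frequent itemset miner such as Apriori [2] or LCM [47] instead of enumerating all action pairs.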

References

  1. Adnan M, Alhajj R, Rokne J (2009) Identifying social communities by frequent pattern mining. In: Paper presented at the 13th international conference information visualisation

  2. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Paper presented at the proceedings of the 20th international conference on very large data bases

  3. Balasundaram B, Butenko S, Hicks IV (2011) Clique relaxations in social network analysis: the maximum k-plex problem. Oper Res 59(1):133–142

  4. Berlingerio M, Bonchi F, Bringmann B, Gionis A (2009) Mining graph evolution rules. In: Paper presented at the joint European conference on machine learning and knowledge discovery in databases

  5. Borgatti SP, Everett MG (1992) Graph colorings and power in experimental exchange networks. Soc Netw 14(3):287–308

  6. Bron C, Kerbosch J (1973) Algorithm 457: finding all cliques of an undirected graph. Commun ACM 16(9):575–577

  7. Aggarwal CC (ed) (2011) Social network data analytics. Springer, Berlin

  8. Clauset A, Moore C, Newman ME (2008) Hierarchical structure and the prediction of missing links in networks. Nature 453(7191):98–101

  9. Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111

  10. Coscia M, Giannotti F, Pedreschi D (2011) A classification for community discovery methods in complex networks. Stat Anal Data Mining 4(5):512–546

  11. De Meo P, Nocera A, Terracina G, Ursino D (2011) Recommendation of similar users, resources and social networks in a social internetworking scenario. Inf Sci 181(7):1285–1305

  12. de Santana VF, Baranauskas MCC (2015) WELFIT: a remote evaluation tool for identifying Web usage patterns through client-side logging. Int J Hum Comput Stud 76:40–49

  13. Dinh TN, Xuan Y, Thai MT (2009) Towards social-aware routing in dynamic communication networks. In: Paper presented at the IEEE 28th international performance computing and communications conference

  14. Eliassi-Rad T, Henderson K, Papadimitriou S, Faloutsos C (2010) A hybrid community discovery framework for complex networks. In: Paper presented at the SIAM conference on data mining

  15. Everett MG, Borgatti SP (1996) Exact colorations of graphs and digraphs. Soc Netw 18(4):319–331

  16. Feng M, Li J, Dong G, Wong L (2009) Maintenance of frequent patterns: a survey. In: Zhao Y, Zhang C, Cao L (eds) Post-mining of association rules: techniques for effective knowledge extraction. Information Science Reference, Hershey, PA, pp 273–293. doi:10.4018/978-1-60566-404-0.ch014

  17. Flake GW, Lawrence S, Giles CL, Coetzee FM (2002) Self-organization and identification of web communities. Computer 35(3):66–70

  18. Franz M, Ward T, McCarley JS, Zhu W-J (2001) Unsupervised and supervised clustering for topic tracking. In: Paper presented at the proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval

  19. Ganley D, Lampe C (2009) The ties that bind: social network principles in online communities. Decis Support Syst 47(3):266–274

  20. Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826

  21. Goyal A, Bonchi F, Lakshmanan LV (2008) Discovering leaders from community actions. In: Paper presented at the proceedings of the 17th ACM conference on information and knowledge management

  22. Guimera R, Amaral LAN (2005) Functional cartography of complex metabolic networks. Nature 433(7028):895–900

  23. Hofman JM, Wiggins CH (2008) Bayesian approach to network modularity. Phys Rev Lett 100(25):258701

  24. Ito H, Iwama K (2009) Enumeration of isolated cliques and pseudo-cliques. ACM Trans Algorithms (TALG) 5(4):40

  25. Ito H, Iwama K, Osumi T (2005) Linear-time enumeration of isolated cliques. In: Paper presented at the European symposium on algorithms

  26. Kanawati R (2011) LICOD: Leaders identification for community detection in complex networks. In: Paper presented at the privacy, security, risk and trust (PASSAT) and 2011 IEEE third international conference on social computing (SocialCom)

  27. Khorasgani RR, Chen J, Zaïane OR (2010) Top leaders community detection approach in information networks. In: Paper presented at the 4th SNA-KDD workshop on social network mining and analysis, Washington, DC

  28. Kiss C, Bichler M (2008) Identification of influencers—measuring influence in customer networks. Decis Support Syst 46(1):233–253

  29. Komusiewicz C, Hüffner F, Moser H, Niedermeier R (2009) Isolation concepts for efficiently enumerating dense subgraphs. Theor Comput Sci 410(38):3640–3654

  30. Kumar R, Raghavan P, Rajagopalan S, Tomkins A (1999) Trawling the web for emerging cyber-communities. Comput Netw 31(11):1481–1493

  31. Kuramochi M, Karypis G (2005) Finding frequent patterns in a large sparse graph. Data Min Knowl Discov 11(3):243–271

  32. Lam HW, Wu C (2009) Finding influential eBay buyers for viral marketing: a conceptual model of BuyerRank. In: Paper presented at the international conference on advanced information networking and applications

  33. Lehmann S, Schwartz M, Hansen LK (2008) Biclique communities. Phys Rev E 78(1):016108

  34. Leskovec J, Lang KJ, Dasgupta A, Mahoney MW (2008) Statistical properties of community structure in large social and information networks. In: Paper presented at the proceedings of the 17th international conference on world wide web

  35. Lu D, Li Q, Liao SS (2012) A graph-based action network framework to identify prestigious members through member’s prestige evolution. Decis Support Syst 53(1):44–54

  36. Mislove AE (2009) Online social networks: measurement, analysis, and applications to distributed information systems. ProQuest, Rice University, Ann Arbor, United States

  37. Nguyen NP, Dinh TN, Xuan Y, Thai MT (2011) Adaptive algorithms for detecting community structure in dynamic social networks. In: Paper presented at the Proceedings of the IEEE (INFOCOM 2011)

  38. Nijssen S, Kok JN (2004) A quickstart in frequent structure mining can make a difference. In: Paper presented at the Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining

  39. Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814–818

  40. Pathak N, DeLong C, Banerjee A, Erickson K (2008) Social topic models for community extraction. In: Paper presented at the 2nd SNA-KDD workshop

  41. Qi G-J, Aggarwal CC, Huang T (2012) Community detection with edge content in social media networks. In: Paper presented at the 2012 IEEE 28th international conference on data engineering

  42. Sachan M, Contractor D, Faruquie TA, Subramaniam LV (2012) Using content and interactions for discovering communities in social networks. In: Paper presented at the proceedings of the 21st international conference on world wide web

  43. Saito K, Yamada T, Kazama K (2008) Extracting communities from complex networks by the k-dense method. IEICE Trans Fundam Electron Commun Comput Sci 91(11):3304–3311

  44. Satuluri V, Parthasarathy S (2009) Scalable graph clustering using stochastic flows: applications to community discovery. In: Paper presented at the proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining

  45. Shen H, Cheng X, Cai K, Hu M-B (2009) Detect overlapping and hierarchical community structure in networks. Phys A Stat Mech Appl 388(8):1706–1712

  46. Troussas C, Virvou M, Caro J, Espinosa KJ (2013) Mining relationships among user clusters in Facebook for language learning. In: Paper presented at the international conference on computer, information and telecommunication systems (CITS)

  47. Uno T, Kiyomi M, Arimura H (2005) LCM ver. 3: collaboration of array, bitmap and prefix tree for frequent itemset mining. In: Paper presented at the proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations

  48. Wasserman S, Faust K (1994) Social network analysis: methods and applications, vol 8. Cambridge University Press, Cambridge

  49. Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques. Morgan Kaufmann, San Francisco

  50. Yan X, Han J (2002) gSpan: graph-based substructure pattern mining. In: Paper presented at the Proceedings of the IEEE international conference on data mining (ICDM 2002)

  51. Yang T, Jin R, Chi Y, Zhu S (2009) Combining link and content for community detection: a discriminative approach. In: Paper presented at the proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining

  52. Zhou D, Manavoglu E, Li J, Giles CL, Zha H (2006) Probabilistic models for discovering e-communities. In: Paper presented at the proceedings of the 15th international conference on world wide web

  53. Zhou Y, Cheng H, Yu JX (2009) Graph clustering based on structural/attribute similarities. Proc VLDB Endow 2(1):718–729

  54. Zhu Z, Cao G, Zhu S, Ranjan S, Nucci A (2012) A social network based patching scheme for worm containment in cellular networks. In: Thai MT, Pardalos PM (eds) Handbook of optimization in complex networks: communication and social networks. Springer, Berlin, New York, pp 505–533

  55. Zhuge H (2009) Communities and emerging semantics in semantic link network: discovery and learning. IEEE Trans Knowl Data Eng 21(6):785–799

Author information

Corresponding author

Correspondence to Mehrdad Jalali.

Appendix: Structure-based evaluation

We compared our detected communities with those of other approaches [26, 27] in terms of structural metrics.

To demonstrate that the proposed method extracts communities that are consistent in terms of structure and density, we implemented the proposed method and the two approaches [26, 27] on the Last.Fm data set. For the approaches in [26, 27], we computed core communities several times, each time with a different centrality measure (shown in the Method column), although the two papers [26, 27] mention only the betweenness, closeness, and degree metrics. In each method, communities are formed around the cores, and membership is determined by voting among a node's neighbors (the best results of the two approaches [26, 27] are reported). Results are shown in Table 6 in terms of density, diameter, and distance.

Even the group-analysis results of our approach are not far from those of the previous works [26, 27] in terms of average density, average distance between nodes, and average maximum distance (diameter). The results indicate that our approach extracts groups of similar users who are also, to some degree, connected. Note that overlap is permitted in each method. This lowers the average density and raises the average distance and diameter; however, it applies to all methods equally. Higher density and lower distance and diameter are better, and our approach is not far from the previous approaches on these measures. This shows that our approach is acceptable at extracting communities that are consistent in terms of structure and density; however, our aim is the discovery of similar people in terms of performance and relationships in online social networks, and it is limited to a particular structure and its specific applications.

Table 6 Structure-based evaluation
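
The three structural measures named in the appendix can be computed for a single community as in the sketch below. This is an illustration under stated assumptions, not the evaluation code used in the paper: the function names are hypothetical, the community is treated as an undirected induced subgraph, distances are shortest-path hop counts, and unreachable pairs are skipped.

```python
from collections import deque


def induced_adjacency(members, edges):
    """Undirected adjacency restricted to edges with both ends in `members`."""
    adj = {u: set() for u in members}
    for u, v in edges:
        if u in adj and v in adj and u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Hop-count distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def structure_metrics(members, edges):
    """Return (density, average distance, diameter) of one community.

    Assumption: unreachable pairs are ignored rather than counted as infinite.
    """
    adj = induced_adjacency(members, edges)
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2  # each edge is seen twice
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    lengths = []
    for u in adj:
        lengths.extend(d for v, d in bfs_distances(adj, u).items() if v != u)
    avg_distance = sum(lengths) / len(lengths) if lengths else 0.0
    diameter = max(lengths, default=0)
    return density, avg_distance, diameter


# Toy community (invented): a path a-b-c-d; node x lies outside the community.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "x")]
density, avg_distance, diameter = structure_metrics({"a", "b", "c", "d"}, edges)
print(density, avg_distance, diameter)
```

For the path of four nodes, three of the six possible edges are present, so the density is 0.5; the longest shortest path has three hops, and the six pairwise distances sum to 10. Averaging these per-community values over all detected communities would give figures of the kind the appendix compares, with higher density and lower distance and diameter being better.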

Cite this article

Moosavi, S.A., Jalali, M., Misaghian, N. et al. Community detection in social networks using user frequent pattern mining. Knowl Inf Syst 51, 159–186 (2017). https://doi.org/10.1007/s10115-016-0970-8
