Abstract
Social networking sites now offer a rich source of heterogeneous data, and analyzing such data can reveal previously unknown information and relations in these networks. Detecting communities of 'similar' nodes is a challenging topic in the analysis of social network data, and it has been widely studied in the social networking community in terms of the underlying graph structure. Online social networks, in addition to having graph structure, carry useful information about their users, and using this information can improve the quality of community discovery. In this study, a method of community discovery is presented. In addition to the links among nodes, content information is used to improve the quality of the discovered communities. This is a new approach based on frequent patterns and on the actions of users on networks, particularly social networking sites where users carry out their preferred activities. The proposed method works in two steps: first, small communities of similar users are discovered from the interests and activities of users on the network; then, the discovered communities are extended using social relations. The F-measure is used to evaluate the results on two real-world datasets (Blogcatalog and Flickr), and the evaluation shows that the proposed method improves the quality of community detection.
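The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the fixed itemset size, the `min_support` and `min_links` thresholds, and the toy data are all assumptions made for illustration. Seed communities are the groups of users who share a frequent set of actions, and each seed is then extended once with users who have enough friends inside it.

```python
from collections import defaultdict
from itertools import combinations


def seed_communities(actions, min_support, size=2):
    """Step 1 (assumed form): group users who share a frequent action set.

    actions: dict mapping user -> set of action/interest labels.
    Returns a dict mapping each frequent itemset (tuple) to its supporting users.
    """
    supporters = defaultdict(set)
    for user, items in actions.items():
        for itemset in combinations(sorted(items), size):
            supporters[itemset].add(user)
    return {s: u for s, u in supporters.items() if len(u) >= min_support}


def extend_community(members, friends, min_links=2):
    """Step 2 (assumed form): add users with >= min_links friends in the seed."""
    members = set(members)
    candidates = set()
    for m in members:
        candidates |= friends.get(m, set())
    candidates -= members
    added = {u for u in candidates
             if len(friends.get(u, set()) & members) >= min_links}
    return members | added


# Hypothetical toy data: user actions and a symmetric friendship graph.
actions = {
    "ann": {"photo", "travel", "food"},
    "bob": {"photo", "travel"},
    "cat": {"photo", "travel", "music"},
    "dan": {"music"},
    "eve": {"food"},
}
friends = {
    "ann": {"bob", "dan", "eve"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"bob"},
    "dan": {"ann", "bob"},
    "eve": {"ann"},
}

seeds = seed_communities(actions, min_support=3)
extended = {s: extend_community(u, friends) for s, u in seeds.items()}
```

In this toy example the only frequent action set is ("photo", "travel"), shared by ann, bob, and cat; dan joins the extended community because two of his friends are already members, while eve, with a single link, does not.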












References
Adnan M, Alhajj R, Rokne J (2009) Identifying social communities by frequent pattern mining. In: Paper presented at the 13th international conference information visualisation
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: Paper presented at the proceedings of the 20th international conference on very large data bases
Balasundaram B, Butenko S, Hicks IV (2011) Clique relaxations in social network analysis: the maximum k-plex problem. Oper Res 59(1):133–142
Berlingerio M, Bonchi F, Bringmann B, Gionis A (2009) Mining graph evolution rules. In: Paper presented at the joint European conference on machine learning and knowledge discovery in databases
Borgatti SP, Everett MG (1992) Graph colorings and power in experimental exchange networks. Soc Netw 14(3):287–308
Bron C, Kerbosch J (1973) Algorithm 457: finding all cliques of an undirected graph. Commun ACM 16(9):575–577
Aggarwal CC (ed) (2011) Social network data analytics. Springer, Berlin
Clauset A, Moore C, Newman ME (2008) Hierarchical structure and the prediction of missing links in networks. Nature 453(7191):98–101
Clauset A, Newman ME, Moore C (2004) Finding community structure in very large networks. Phys Rev E 70(6):066111
Coscia M, Giannotti F, Pedreschi D (2011) A classification for community discovery methods in complex networks. Stat Anal Data Mining 4(5):512–546
De Meo P, Nocera A, Terracina G, Ursino D (2011) Recommendation of similar users, resources and social networks in a social internetworking scenario. Inf Sci 181(7):1285–1305
de Santana VF, Baranauskas MCC (2015) WELFIT: a remote evaluation tool for identifying Web usage patterns through client-side logging. Int J Hum Comput Stud 76:40–49
Dinh TN, Xuan Y, Thai MT (2009) Towards social-aware routing in dynamic communication networks. In: Paper presented at the IEEE 28th international performance computing and communications conference
Eliassi-Rad T, Henderson K, Papadimitriou S, Faloutsos C (2010) A hybrid community discovery framework for complex networks. In: Paper presented at the SIAM conference on data mining
Everett MG, Borgatti SP (1996) Exact colorations of graphs and digraphs. Soc Netw 18(4):319–331
Feng M, Li J, Dong G, Wong L (2009) Maintenance of frequent patterns: a survey. In: Zhao Y, Zhang C, Cao L (eds) Post-mining of association rules: techniques for effective knowledge extraction. Information Science Reference, Hershey, pp 273–293. doi:10.4018/978-1-60566-404-0.ch014
Flake GW, Lawrence S, Giles CL, Coetzee FM (2002) Self-organization and identification of web communities. Computer 35(3):66–70
Franz M, Ward T, McCarley JS, Zhu W-J (2001) Unsupervised and supervised clustering for topic tracking. In: Paper presented at the proceedings of the 24th annual international ACM SIGIR conference on research and development in information retrieval
Ganley D, Lampe C (2009) The ties that bind: social network principles in online communities. Decis Support Syst 47(3):266–274
Girvan M, Newman ME (2002) Community structure in social and biological networks. Proc Natl Acad Sci 99(12):7821–7826
Goyal A, Bonchi F, Lakshmanan LV (2008) Discovering leaders from community actions. In: Paper presented at the proceedings of the 17th ACM conference on information and knowledge management
Guimera R, Amaral LAN (2005) Functional cartography of complex metabolic networks. Nature 433(7028):895–900
Hofman JM, Wiggins CH (2008) Bayesian approach to network modularity. Phys Rev Lett 100(25):258701
Ito H, Iwama K (2009) Enumeration of isolated cliques and pseudo-cliques. ACM Trans Algorithms (TALG) 5(4):40
Ito H, Iwama K, Osumi T (2005) Linear-time enumeration of isolated cliques. In: Paper presented at the European symposium on algorithms
Kanawati R (2011) LICOD: Leaders identification for community detection in complex networks. In: Paper presented at the privacy, security, risk and trust (PASSAT) and 2011 IEEE third international conference on social computing (SocialCom)
Khorasgani RR, Chen J, Zaïane OR (2010) Top leaders community detection approach in information networks. In: Paper presented at the 4th SNA-KDD workshop on social network mining and analysis, Washington, DC
Kiss C, Bichler M (2008) Identification of influencers—measuring influence in customer networks. Decis Support Syst 46(1):233–253
Komusiewicz C, Hüffner F, Moser H, Niedermeier R (2009) Isolation concepts for efficiently enumerating dense subgraphs. Theor Comput Sci 410(38):3640–3654
Kumar R, Raghavan P, Rajagopalan S, Tomkins A (1999) Trawling the web for emerging cyber-communities. Comput Netw 31(11):1481–1493
Kuramochi M, Karypis G (2005) Finding frequent patterns in a large sparse graph. Data Min Knowl Discov 11(3):243–271
Lam HW, Wu C (2009) Finding influential eBay buyers for viral marketing: a conceptual model of BuyerRank. In: Paper presented at the international conference on advanced information networking and applications
Lehmann S, Schwartz M, Hansen LK (2008) Biclique communities. Phys Rev E 78(1):016108
Leskovec J, Lang KJ, Dasgupta A, Mahoney MW (2008) Statistical properties of community structure in large social and information networks. In: Paper presented at the proceedings of the 17th international conference on world wide web
Lu D, Li Q, Liao SS (2012) A graph-based action network framework to identify prestigious members through member’s prestige evolution. Decis Support Syst 53(1):44–54
Mislove AE (2009) Online social networks: measurement, analysis, and applications to distributed information systems. ProQuest, Rice University, Ann Arbor, United States
Nguyen NP, Dinh TN, Xuan Y, Thai MT (2011) Adaptive algorithms for detecting community structure in dynamic social networks. In: Paper presented at the Proceedings of the IEEE (INFOCOM 2011)
Nijssen S, Kok JN (2004) A quickstart in frequent structure mining can make a difference. In: Paper presented at the Proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining
Palla G, Derényi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043):814–818
Pathak N, DeLong C, Banerjee A, Erickson K (2008) Social topic models for community extraction. In: Paper presented at the 2nd SNA-KDD workshop
Qi G-J, Aggarwal CC, Huang T (2012) Community detection with edge content in social media networks. In: Paper presented at the 2012 IEEE 28th international conference on data engineering
Sachan M, Contractor D, Faruquie TA, Subramaniam LV (2012) Using content and interactions for discovering communities in social networks. In: Paper presented at the proceedings of the 21st international conference on world wide web
Saito K, Yamada T, Kazama K (2008) Extracting communities from complex networks by the k-dense method. IEICE Trans Fundam Electron Commun Comput Sci 91(11):3304–3311
Satuluri V, Parthasarathy S (2009) Scalable graph clustering using stochastic flows: applications to community discovery. In: Paper presented at the proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining
Shen H, Cheng X, Cai K, Hu M-B (2009) Detect overlapping and hierarchical community structure in networks. Phys A Stat Mech Appl 388(8):1706–1712
Troussas C, Virvou M, Caro J, Espinosa KJ (2013) Mining relationships among user clusters in Facebook for language learning. In: Paper presented at the international conference on computer, information and telecommunication systems (CITS)
Uno T, Kiyomi M, Arimura H (2005) LCM ver. 3: collaboration of array, bitmap and prefix tree for frequent itemset mining. In: Paper presented at the proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations
Wasserman S, Faust K (1994) Social network analysis: methods and applications, vol 8. Cambridge University Press, Cambridge
Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques. Morgan Kaufmann, San Francisco
Yan X, Han J (2002) gSpan: graph-based substructure pattern mining. In: Paper presented at the proceedings of the IEEE international conference on data mining (ICDM 2002)
Yang T, Jin R, Chi Y, Zhu S (2009) Combining link and content for community detection: a discriminative approach. In: Paper presented at the proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining
Zhou D, Manavoglu E, Li J, Giles CL, Zha H (2006) Probabilistic models for discovering e-communities. In: Paper presented at the proceedings of the 15th international conference on world wide web
Zhou Y, Cheng H, Yu JX (2009) Graph clustering based on structural/attribute similarities. Proc VLDB Endow 2(1):718–729
Zhu Z, Cao G, Zhu S, Ranjan S, Nucci A (2012) A social network based patching scheme for worm containment in cellular networks. In: Thai MT, Pardalos PM (eds) Handbook of optimization in complex networks: communication and social networks. Springer, Berlin, New York, pp 505–533
Zhuge H (2009) Communities and emerging semantics in semantic link network: discovery and learning. IEEE Trans Knowl Data Eng 21(6):785–799
Appendix: Structure-based evaluation
We compared the communities detected by our method with those of two other approaches [26, 27] in terms of structural metrics.
To show that the proposed method extracts communities that are consistent in structure and density, we implemented it together with the two approaches [26, 27] on the Last.Fm data set. For the approaches in [26, 27], we computed core communities several times, each time with a different centrality measure (listed in the Method column), although the two papers mention only betweenness, closeness, and degree. In each of these methods, communities are formed around the cores, and membership is determined by voting from neighbors; the best results of the two approaches [26, 27] are reported. Table 6 shows the results in terms of density, diameter, and distance. Even the results of the group analysis in our approach are not far from those of the previous works [26, 27] in average density, average distance between nodes, and average maximum distance (diameter). The results indicate that our approach extracts communities of similar users who are also reasonably well connected. Note that overlap is permitted in every method, which lowers the average density and raises the average distance and diameter; this applies to all methods equally. Higher density and lower distance and diameter are better. Our approach is not far from the previous approaches on these metrics, which shows that it is acceptable at extracting communities that are consistent in structure and density. However, our aim is the discovery of people who are similar in terms of performance and relationships in online social networks, and it is limited to a particular structure and its specific applications.
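The three structural metrics used in this comparison (density, distance between nodes, and diameter) can be computed for a single community with a short sketch like the one below. It is an illustration under stated assumptions, not the evaluation code behind Table 6: the graph is taken to be undirected and unweighted, distances are measured inside the community's induced subgraph over reachable pairs only, and all function names and the toy graph are hypothetical. Averaging the per-community values over all detected communities would give figures of the kind reported in the appendix.

```python
from collections import deque


def induced(adj, members):
    """Restrict an undirected adjacency dict to the given community members."""
    members = set(members)
    return {u: adj.get(u, set()) & members for u in members}


def density(sub):
    """Fraction of possible member pairs that are directly linked."""
    n = len(sub)
    if n < 2:
        return 0.0
    edges = sum(len(nb) for nb in sub.values()) / 2  # each edge counted twice
    return edges / (n * (n - 1) / 2)


def bfs_distances(sub, source):
    """Shortest-path lengths (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sub[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def distance_stats(sub):
    """Return (average distance, diameter) over reachable member pairs."""
    lengths = []
    for u in sub:
        lengths.extend(d for v, d in bfs_distances(sub, u).items() if v != u)
    if not lengths:
        return 0.0, 0
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical toy graph: community {a, b, c, d} plus an outside node e.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
community = induced(adj, {"a", "b", "c", "d"})
community_density = density(community)
avg_distance, diameter = distance_stats(community)
```

For this toy community, 4 of the 6 possible pairs are linked, so the density is about 0.67; the average distance is about 1.33 and the diameter is 2.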
Cite this article
Moosavi, S.A., Jalali, M., Misaghian, N. et al. Community detection in social networks using user frequent pattern mining. Knowl Inf Syst 51, 159–186 (2017). https://doi.org/10.1007/s10115-016-0970-8
DOI: https://doi.org/10.1007/s10115-016-0970-8