Abstract
An important area of social network research is identifying missing information that is not visible or explicitly represented in the network. Recently, the missing node identification problem was introduced, in which missing members of the social network structure must be identified. However, previous works did not consider the possibility that information about specific users (nodes) within the network may be known and could be useful in solving this problem. Assuming that information such as user demographics and users’ historical behavior in the network is known, more effective algorithms for the missing node identification problem could potentially be developed. In this paper, we present three algorithms, SAMI-A, SAMI-C and SAMI-N, which leverage this type of information to perform significantly better than previous missing node algorithms. However, because each of these algorithms, and each parameter setting within them, often performs better in specific problem instances, a mechanism is needed to select the best algorithm and the best variation within that algorithm. To address this challenge, we also present OASCA, a novel online selection algorithm. We present results that detail the success of the algorithms presented in this paper.
Notes
Previous work (Eyal et al. 2011) has found that this number can also be effectively estimated.
References
Adamic LA, Adar E (2003) Friends and neighbors on the web. Soc Netw 25(3):211–230
Backstrom L, Leskovec J (2011) Supervised random walks: predicting and recommending links in social networks. In: Proceedings of the fourth ACM international conference on Web search and data mining, ACM, pp 635–644
Becker R, Chernihov Y, Shavitt Y, Zilberman N (2012) An analysis of the steam community network evolution. In: 2012 IEEE 27th Convention of Electrical & Electronics Engineers in Israel (IEEEI), pp 1–5
Brand M (2005) A random walks perspective on maximizing satisfaction and profit. In: SIAM international conference on data mining, pp 12–19
Clauset A, Moore C, Newman MEJ (2008) Hierarchical structure and the prediction of missing links in networks. Nature 453(7191):98–101
Eslami M, Rabiee HR, Salehi M (2011) Dne: a method for extracting cascaded diffusion networks from social networks. In: SocialCom/PASSAT, pp 41–48
Eyal R, Rosenfeld A, Kraus S (2011) Identifying missing node information in social networks. In: Twenty-Fifth AAAI Conference on Artificial Intelligence
Eyal R, Rosenfeld A, Sina S, Kraus S (2014) Predicting and identifying missing node information in social networks. ACM Transactions on Knowledge Discovery from Data (TKDD) (to appear)
Fortunato S (2010) Community detection in graphs. Phys Rep 486(3–5):75–174
Freno A, Garriga G, Keller M (2011) Learning to recommend links using graph structure and node content. In: Neural information processing systems workshop on choice models and preference learning
Gomes CP, Selman B (2001) Algorithm portfolios. Artif Intell (AIJ) 126(1–2):43–62
Gomez-Rodriguez M, Leskovec J, Krause A (2012) Inferring networks of diffusion and influence. TKDD 5(4):21
Gong NZ, Talwalkar A, Mackey LW, Huang L, Shin ECR, Stefanov E, Shi E, Song D (2011) Predicting links and inferring attributes using a social-attribute network (san). CoRR
Guimerà R, Sales-Pardo M (2009) Missing and spurious interactions and the reconstruction of complex networks. Proc Natl Acad Sci 106(52):22073–22078
Halkidi M, Vazirgiannis M (2001) A data set oriented approach for clustering algorithm selection. In: PKDD, pp 165–179
Kadioglu S, Malitsky Y, Sellmann M, Tierney K (2010) Isac—instance-specific algorithm configuration. In: ECAI, pp 751–756
Katz L (1953) A new status index derived from sociometric analysis. Psychometrika 18(1):39–43
Kim M, Leskovec J (2011) The network completion problem: inferring missing nodes and edges in networks. In: SIAM international conference on data mining (SDM), 2011
Kim M, Leskovec J (2012) Latent multi-group membership graph model. arXiv:1205.4546
Kossinets G (2003) Effects of missing data in social networks. Soc Netw 28:247–268
Kostakis O, Kinable J, Mahmoudi H, Mustonen K (2011) Improved call graph comparison using simulated annealing. In: Proceedings of the 2011 ACM symposium on applied computing, ser. SAC ’11, pp 1516–1523
Leroy V, Cambazoglu BB, Bonchi F (2010) Cold start link prediction. SIGKDD 2010
Leskovec J, Kleinberg J, Faloutsos C (2005) Graphs over time: densification laws, shrinking diameters and possible explanations. In: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining. ACM, pp 177–187
Leskovec J, Faloutsos C (2006) Sampling from large graphs. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 631–636
Liben-Nowell D, Kleinberg J (2007) The link-prediction problem for social networks. J Am Soc Inf Sci Technol 58(7):1019–1031
Lin W, Kong X, Yu PS, Wu Q, Jia Y, Li C (2012) Community detection in incomplete information networks. In: WWW, pp 341–350
McPherson M, Smith-Lovin L, Cook JM (2001) Birds of a feather: Homophily in social networks. Ann Rev Sociol, pp 415–444
Minton S, Johnston MD, Philips AB, Laird P (1992) Minimizing conflicts: a heuristic repair method for constraint satisfaction and scheduling problems. Artif Intell 58(1–3):161–205
Ng AY, Jordan MI, Weiss Y (2001) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems 14, MIT Press, pp 849–856
Porter MA, Onnela J-P, Mucha PJ (2009) Communities in networks. Not Am Math Soc 56(9):1082–1097
Rice JR (1976) The algorithm selection problem. Adv Comput 15:118–165
Sadikov E, Medina M, Leskovec J, Garcia-Molina H (2011) Correcting for missing data in information cascades. In: WSDM, pp 55–64
Sina S, Rosenfeld A, Kraus S (2013) Solving the missing node problem using structure and attribute information. In: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ACM, pp 744–751
Strehl A, Ghosh J (2003) Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617
Talman S, Toister R, Kraus S (2005) Choosing between heuristics and strategies: an enhanced model for decision-making. Int Jt Conf Artif Intell 18:324–330
von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416
Yin Z, Gupta M, Weninger T, Han J (2010) Linkrec: a unified framework for link recommendation with user attributes and graph structure. In: Proceedings of the 19th international conference on World wide web. ACM, pp 1211–1212
Yin Z, Gupta M, Weninger T, Han J (2010) A unified framework for link recommendation using random walks. In: 2010 International Conference on Advances in Social Networks Analysis and Mining (ASONAM). IEEE, pp 152–159
Acknowledgments
This research is based on work supported in part by MAFAT and the Israel Science Foundation Grant #1488/14. Preliminary results of this research were published in the ASONAM 2013 paper entitled ‘Solving the Missing Node Problem using Structure and Attribute Information’.
About this article
Cite this article
Sina, S., Rosenfeld, A. & Kraus, S. SAMI: an algorithm for solving the missing node problem using structure and attribute information. Soc. Netw. Anal. Min. 5, 54 (2015). https://doi.org/10.1007/s13278-015-0296-7
DOI: https://doi.org/10.1007/s13278-015-0296-7