Abstract
Data that involves some sort of relationship or interaction can be represented, modelled and analyzed using the notion of a network. The link prediction problem, which is concerned with predicting how the topology of a network evolves over time, is central to understanding the dynamics of networks. Previous work in this direction has largely focussed on finding an extensive set of features capable of predicting the formation of a link, often within some domain-specific context. This sometimes results in a “black box” type of approach in which it is unclear how the (often computationally expensive) features contribute to the accuracy of the final predictor. This paper counters these problems by categorising the large set of proposed link prediction features based on their topological scope, and by showing that the contribution of particular categories of features can be explained by simple structural properties of the network. An approach called the Efficient Feature Set is presented, which uses a limited but explainable set of computationally efficient features that, within each scope, capture the essential network properties. Its performance is experimentally verified on a large number of diverse real-world network datasets. The result is a generic approach suitable for consistently predicting links with high accuracy.
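To make the idea of features with different topological scopes concrete, the following is a minimal sketch in Python. It is not the paper's Efficient Feature Set, whose contents the abstract does not list. The three scopes used here (node, neighbourhood, path), the chosen features, and the helper names `build_adjacency`, `shortest_path_length` and `pair_features` are assumptions made purely for illustration.

```python
from collections import deque
from math import log


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_length(adj, source, target):
    """Breadth-first search distance in hops; None if target is unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adj[node]:
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def pair_features(adj, u, v):
    """Illustrative features for a candidate link (u, v), with u != v.

    The grouping into scopes is an assumption for this sketch.
    """
    common = adj[u] & adj[v]
    return {
        # Node scope: uses only the degrees of the two endpoints.
        "degree_product": len(adj[u]) * len(adj[v]),
        # Neighbourhood scope: uses the direct neighbours of both endpoints.
        "common_neighbours": len(common),
        # A common neighbour has degree >= 2, so log(degree) is positive.
        "adamic_adar": sum(1.0 / log(len(adj[w])) for w in common),
        # Path scope: uses the wider graph structure between the endpoints.
        "distance": shortest_path_length(adj, u, v),
    }


if __name__ == "__main__":
    toy_edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
    adjacency = build_adjacency(toy_edges)
    print(pair_features(adjacency, 1, 4))
```

In a supervised setting, such feature dictionaries would typically be computed for node pairs in an earlier snapshot of the network and used to train a classifier that predicts which pairs become linked later. The classifier and the train/test split are outside the scope of this sketch.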
© 2016 Springer International Publishing AG
About this paper
Cite this paper
van Engelen, J.E., Boekhout, H.D., Takes, F.W. (2016). Explainable and Efficient Link Prediction in Real-World Network Data. In: Boström, H., Knobbe, A., Soares, C., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XV. IDA 2016. Lecture Notes in Computer Science, vol 9897. Springer, Cham. https://doi.org/10.1007/978-3-319-46349-0_26
DOI: https://doi.org/10.1007/978-3-319-46349-0_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46348-3
Online ISBN: 978-3-319-46349-0
eBook Packages: Computer Science, Computer Science (R0)