Explainable and Efficient Link Prediction in Real-World Network Data

van Engelen, Jesper E.; Boekhout, Hanjo D.; Takes, Frank W.

doi:10.1007/978-3-319-46349-0_26

Conference paper
First Online: 21 September 2016

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9897)

Abstract

Data that involves some sort of relationship or interaction can be represented, modelled and analyzed using the notion of a network. To understand the dynamics of networks, the link prediction problem is concerned with predicting the evolution of the topology of a network over time. Previous work in this direction has largely focussed on finding an extensive set of features capable of predicting the formation of a link, often within some domain-specific context. This sometimes results in a “black box” type of approach in which it is unclear how the (often computationally expensive) features contribute to the accuracy of the final predictor. This paper counters these problems by categorising the large set of proposed link prediction features based on their topological scope, and showing that the contribution of particular categories of features can actually be explained by simple structural properties of the network. An approach called the Efficient Feature Set is presented that uses a limited but explainable set of computationally efficient features that within each scope captures the essential network properties. Its performance is experimentally verified using a large number of diverse real-world network datasets. The result is a generic approach suitable for consistently predicting links with high accuracy.
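To make the idea of computationally cheap, neighbourhood-scope features concrete, the following is a minimal sketch in Python. It is not the paper's Efficient Feature Set, whose exact composition is not given in this abstract. It assumes an undirected, unweighted network stored as a dictionary of neighbour sets and computes four standard local scores (common neighbours, Jaccard coefficient, Adamic-Adar, preferential attachment) for one candidate node pair. The function name `neighbourhood_features` and the toy graph are hypothetical.

```python
import math


def neighbourhood_features(adj, u, v):
    """Return simple local link prediction scores for the candidate pair (u, v).

    adj maps each node to the set of its neighbours (undirected graph).
    These are standard textbook scores, used here only for illustration.
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        # Number of shared neighbours.
        "common_neighbours": len(common),
        # Shared neighbours relative to all neighbours of either node.
        "jaccard": len(common) / len(union) if union else 0.0,
        # Shared neighbours weighted by the inverse log of their degree.
        "adamic_adar": sum(
            1.0 / math.log(len(adj[z])) for z in common if len(adj[z]) > 1
        ),
        # Product of the two node degrees.
        "preferential_attachment": len(nu) * len(nv),
    }


# Hypothetical toy graph, built as an adjacency dictionary of sets.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = {}
for x, y in edges:
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set()).add(x)

features = neighbourhood_features(adj, "a", "d")
print(features)
```

Each score only inspects the direct neighbours of the two nodes, which is why features of this scope are cheap to compute. In a supervised setting, such scores would typically be computed for many candidate pairs and passed to a classifier.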

Author information

Authors and Affiliations

LIACS, Leiden University, Leiden, The Netherlands
Jesper E. van Engelen, Hanjo D. Boekhout & Frank W. Takes
AISSR, University of Amsterdam, Amsterdam, The Netherlands
Frank W. Takes

Corresponding author

Correspondence to Frank W. Takes.

Editor information

Editors and Affiliations

Stockholm University, Stockholm, Sweden
Henrik Boström
Leiden University, Leiden, The Netherlands
Arno Knobbe
University of Porto, Porto, Portugal
Carlos Soares
Stockholm University, Stockholm, Sweden
Panagiotis Papapetrou

Cite this paper

van Engelen, J.E., Boekhout, H.D., Takes, F.W. (2016). Explainable and Efficient Link Prediction in Real-World Network Data. In: Boström, H., Knobbe, A., Soares, C., Papapetrou, P. (eds) Advances in Intelligent Data Analysis XV. IDA 2016. Lecture Notes in Computer Science, vol 9897. Springer, Cham. https://doi.org/10.1007/978-3-319-46349-0_26

DOI: https://doi.org/10.1007/978-3-319-46349-0_26
Published: 21 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46348-3
Online ISBN: 978-3-319-46349-0
eBook Packages: Computer Science, Computer Science (R0)
