Scaling up Link Prediction with Ensembles

Research article. DOI: 10.1145/2835776.2835815
Published: 08 February 2016

Abstract

A network with $n$ nodes contains $O(n^2)$ possible links. Even for networks of modest size, it is often difficult to evaluate all pairwise possibilities for links in a meaningful way. Furthermore, even though link prediction is closely related to missing value estimation problems such as collaborative filtering, sophisticated models such as latent factor methods are often hard to use because of their computational complexity on very large networks. Because of this complexity, most known link prediction methods are designed to evaluate link propensity over a specified subset of links rather than to perform a global search over the entire network. In practice, however, it is essential to perform an exhaustive search over the entire network. In this paper, we propose an ensemble-enabled approach to scaling up link prediction, which decomposes a traditional link prediction problem into subproblems of smaller size. Each subproblem is solved with a latent factor model, which can be implemented effectively over networks of modest size. The ensemble-enabled approach also has several advantages in terms of performance. We show the advantage of ensemble-based latent factor models with experiments on very large networks. Experimental results demonstrate the effectiveness and scalability of our approach.
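
The decomposition described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not say how subproblems are sampled or how their predictions are combined, so uniform node sampling, plain multiplicative-update NMF as the latent factor model, and score averaging are all assumptions made here for illustration. The function names `nmf` and `ensemble_link_scores` and every parameter are hypothetical, and the dense score matrix is only workable for toy graphs.

```python
import numpy as np


def nmf(A, rank, iters=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix A by W @ H using multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], rank))
    H = rng.random((rank, A.shape[1]))
    for _ in range(iters):
        # Multiplicative updates keep W and H nonnegative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


def ensemble_link_scores(A, n_components=20, sample_size=8, rank=2, seed=0):
    """Average latent-factor scores over random node subsets (assumed scheme)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    score_sum = np.zeros((n, n))
    counts = np.zeros((n, n))
    for c in range(n_components):
        # Subproblem: the adjacency submatrix induced by a random node sample.
        nodes = rng.choice(n, size=sample_size, replace=False)
        block = np.ix_(nodes, nodes)
        W, H = nmf(A[block], rank, seed=seed + c)
        # Nodes are sampled without replacement, so in-place adds are safe.
        score_sum[block] += W @ H
        counts[block] += 1
    # Average over the components that covered each pair; uncovered pairs get 0.
    scores = np.divide(score_sum, counts,
                       out=np.zeros_like(score_sum), where=counts > 0)
    scores[A > 0] = 0.0          # existing links are not candidates
    np.fill_diagonal(scores, 0.0)  # no self-links
    return scores


if __name__ == "__main__":
    # Toy graph: two 5-node cliques, with the edge (0, 1) removed.
    A = np.zeros((10, 10))
    A[:5, :5] = 1
    A[5:, 5:] = 1
    np.fill_diagonal(A, 0)
    A[0, 1] = A[1, 0] = 0
    S = ensemble_link_scores(A)
    i, j = np.unravel_index(np.argmax(S), S.shape)
    print("highest-scoring missing pair:", (i, j))
```

Each component factorizes only a `sample_size` by `sample_size` matrix, so the per-component cost no longer depends on the full $n^2$ pairs; the price is that a pair is scored only by the components whose sample contains both endpoints.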

Published In

WSDM '16: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
February 2016
746 pages
ISBN: 9781450337168
DOI: 10.1145/2835776
Publisher

Association for Computing Machinery, New York, NY, United States
Author Tags

1. big data
2. ensembles
3. link prediction
4. networks

Conference

WSDM 2016: Ninth ACM International Conference on Web Search and Data Mining
February 22-25, 2016
San Francisco, California, USA

Acceptance Rates

WSDM '16 paper acceptance rate: 67 of 368 submissions (18%)
Overall acceptance rate: 498 of 2,863 submissions (17%)
