Link predictability classes in large node-attributed networks

Antonov, Andrey; Stavinova, Elizaveta; Evmenova, Elizaveta; Chunaev, Petr

doi:10.1007/s13278-022-00912-w

Original Article
Published: 15 July 2022

Social Network Analysis and Mining, volume 12, article number 81 (2022)

Andrey Antonov¹, Elizaveta Stavinova¹ (ORCID: orcid.org/0000-0003-2640-3759), Elizaveta Evmenova¹ & Petr Chunaev¹

Abstract

In this paper, we study how the observed quality of a chosen feature-based link prediction model, applied to one part of a large node-attributed network, can be used to analyze another part of the network. Namely, we first show that one can determine on the former part which features (topological and attributive) of node pairs lead to a certain level of link prediction quality. Based on this, we then construct a link predictability (prediction quality) classifier for the network’s node pairs that distinguishes poorly and highly predictable links using a few selected features of the corresponding nodes. The features are selected to provide a reasonable trade-off between the classifier’s time consumption and its quality. The classifier is then used on the other part of the network to control the link prediction quality typical for the model and the network, without performing the actual link prediction. Our experiments show that the classifier performs well on all tested real-world networks of various types (at least 0.9208 in terms of ROC-AUC and 0.9224 in terms of Average Precision).
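To make the two ingredients of the abstract more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that a node pair counts as highly predictable when the absolute error of the link prediction model on that pair stays below a threshold; this labelling rule, the threshold `max_error`, the function names, and all numbers are illustrative assumptions. The sketch then scores a hypothetical predictability classifier with ROC-AUC and Average Precision, the two metrics reported in the abstract.

```python
def predictability_labels(true_links, predicted_probs, max_error=0.5):
    """Label a node pair 1 (highly predictable) when the link prediction
    model's absolute error on it is at most max_error, else 0 (poorly).
    The rule and the threshold are assumptions made for illustration."""
    return [1 if abs(y - p) <= max_error else 0
            for y, p in zip(true_links, predicted_probs)]


def roc_auc(labels, scores):
    """ROC-AUC as the share of (positive, negative) pairs in which the
    positive gets the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average Precision: mean of the precision values taken at the rank
    of each positive item, with items sorted by decreasing score."""
    ranked = sorted(zip(scores, labels), key=lambda item: -item[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Made-up data for six node pairs of the first network part:
# whether a link exists, and the probability given by the prediction model.
true_links = [1, 1, 0, 0, 1, 0]
predicted_probs = [0.9, 0.8, 0.1, 0.7, 0.3, 0.2]
labels = predictability_labels(true_links, predicted_probs)

# Hypothetical scores of a predictability classifier for the same pairs
# (in practice they would come from a model trained on pair features).
classifier_scores = [0.95, 0.85, 0.60, 0.70, 0.20, 0.75]

auc = roc_auc(labels, classifier_scores)
ap = average_precision(labels, classifier_scores)
print(labels, round(auc, 3), round(ap, 3))
```

In this toy setting the classifier never sees the true links at scoring time; it only ranks node pairs by how reliable the link prediction model is expected to be on them, which is the role the abstract assigns to it on the second part of the network.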

Notes

https://github.com/andrey-antonov-j4133c/link_prediction
Roughly speaking, they indicate the relative importance of predictor variables (features) in linear regression.
https://github.com/slundberg/shap
https://github.com/andrey-antonov-j4133c/attributed_network_research

Acknowledgements

This research is financially supported by the Russian Science Foundation, Agreement 17-71-30029, with co-financing of Bank Saint Petersburg, Russia.

Author information

Authors and Affiliations

ITMO University, 16 Birzhevaya Lane, Saint Petersburg, Russia
Andrey Antonov, Elizaveta Stavinova, Elizaveta Evmenova & Petr Chunaev

Corresponding author

Correspondence to Petr Chunaev.

About this article

Cite this article

Antonov, A., Stavinova, E., Evmenova, E. et al. Link predictability classes in large node-attributed networks. Soc. Netw. Anal. Min. 12, 81 (2022). https://doi.org/10.1007/s13278-022-00912-w

Received: 10 March 2022
Revised: 14 June 2022
Accepted: 20 June 2022
Published: 15 July 2022
DOI: https://doi.org/10.1007/s13278-022-00912-w

Part of a collection: Complex Networks 2021
