Abstract
Link prediction aims to predict missing links or eliminate spurious links using known information about a complex network. As an unsupervised linear feature representation method, the matrix factorization (MF)-based autoencoder (AE) projects a high-dimensional data matrix into a low-dimensional latent space. However, most traditional link prediction methods based on MF or AE adopt shallow models and a single adjacency matrix, so they cannot adequately learn and represent network features and are susceptible to noise. In addition, some methods require a symmetric input matrix and can therefore only be applied to undirected networks. We therefore propose a deep manifold matrix factorization autoencoder model using a global connectivity matrix, called DM-MFAE-G. The model uses the PageRank algorithm to obtain the global connectivity matrix between the nodes of a complex network. DM-MFAE-G performs deep matrix factorization on the local adjacency matrix and the global connectivity matrix separately to obtain local and global multi-layer feature representations, which contain rich structural information. The model is solved by an alternating iterative optimization method, and the convergence of the algorithm is proved. Comprehensive experiments on different real networks demonstrate that the global connectivity matrix and manifold constraints introduced by DM-MFAE-G significantly improve link prediction performance on both directed and undirected networks.
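As a rough illustration of a PageRank-derived global connectivity matrix, the following Python sketch computes a matrix of personalized PageRank scores in closed form. This is an assumption made for illustration only: the abstract does not specify how the matrix is constructed, and the function name `global_connectivity`, the damping value 0.85, and the uniform treatment of dangling nodes are hypothetical choices rather than the authors' definition.

```python
import numpy as np

def global_connectivity(A, damping=0.85):
    """Personalized-PageRank matrix for adjacency matrix A (illustrative sketch).

    Column s of the result is the PageRank vector obtained when the random
    walk always restarts at node s. This construction is an assumption, not
    the definition used by DM-MFAE-G.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    out_deg = A.sum(axis=1, keepdims=True)
    # Row-stochastic transition matrix; nodes without out-links jump uniformly.
    safe_deg = np.where(out_deg > 0, out_deg, 1.0)
    P = np.where(out_deg > 0, A / safe_deg, 1.0 / n)
    # Closed form of pi_s = (1 - d) e_s + d P^T pi_s, stacked over all s.
    return (1.0 - damping) * np.linalg.inv(np.eye(n) - damping * P.T)
```

Because the transition matrix is built from out-links, the result is generally asymmetric for a directed network, so a matrix of this kind can carry edge-direction information.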
References
Chen G, Xu C, Wang J, Feng J, Feng J (2019) Graph regularization weighted nonnegative matrix factorization for link prediction in weighted complex network. Neurocomputing 369:50–60. https://doi.org/10.1016/j.neucom.2019.08.068
Xiao Y, Li R, Lu X, Liu Y (2021) Link prediction based on feature representation and fusion. Inform Sci 548:1–17. https://doi.org/10.1016/j.ins.2020.09.039
Gu S, Chen L, Li B, Liu W, Chen B (2019) Link prediction on signed social networks based on latent space mapping. Appl Intell 49:7513–7528. https://doi.org/10.1007/s10489-018-1284-1
Chen J, Zhang J, Xu X, Fu C, Zhang D, Zhang Q, Xuan Q (2021) E-LSTM-D: A deep learning framework for dynamic network link prediction. IEEE Trans Syst Man Cybern: Syst 51(6):3699–3712. https://doi.org/10.1109/TSMC.2019.2932913
Khaksar Manshad M, Meybodi M, Salajegheh A (2021) A new irregular cellular learning automata-based evolutionary computation for time series link prediction in social networks. Appl Intell 51:71–84. https://doi.org/10.1007/s10489-020-01685-5
Buza K, Peŝka L, Koller J (2020) Modified linear regression predicts drug-target interactions accurately. PLOS ONE 15(4):1–18. https://doi.org/10.1371/journal.pone.0230726
Amiri Souri E, Laddach R, Karagiannis SN, Papageorgiou LG, Tsoka S (2022) Novel drug-target interactions via link prediction and network embedding. BMC Bioinformatics 23(1):1–16
Wang H, Le Z (2021) Expert recommendations based on link prediction during the COVID-19 outbreak. Scientometrics 126:1–20. https://doi.org/10.1007/s11192-021-03893-3
Gu S, Chen L, Li B, Liu W, Chen B (2019) Link prediction on signed social networks based on latent space mapping. Appl Intell: Int J Artif Intell Neural Netw Complex Problem-Solving Technol 49(2):703–722
Zhang Q, Wang R, Yang J, Xue L (2022) Structural context-based knowledge graph embedding for link prediction. Neurocomputing 470:109–120. https://doi.org/10.1016/j.neucom.2021.10.088
Chen L, Cui J, Tang X, Qian Y, Li Y, Zhang Y (2022) Rlpath: a knowledge graph link prediction method using reinforcement learning based attentive relation path searching and representation learning. Appl Intell 52:4715–4726. https://doi.org/10.1007/s10489-021-02672-0
Lü L, Zhou T (2011) Link prediction in complex networks: A survey. Phys A: Stat Mech Appl 390(6):1150–1170. https://doi.org/10.1016/j.physa.2010.11.027
Pan L, Zhou T, Lü L, Hu C-K (2016) Predicting missing links and identifying spurious links via likelihood analysis. Scientific Reports 6
Zhao Z, Gou Z, Du Y, Ma J, Li T, Zhang R (2022) A novel link prediction algorithm based on inductive matrix completion. Expert Syst Appl 188:116033. https://doi.org/10.1016/j.eswa.2021.116033
Newman MEJ (2001) Clustering and preferential attachment in growing networks. Phys Rev E 64:025102. https://doi.org/10.1103/PhysRevE.64.025102
Adamic LA, Adar E (2003) Friends and neighbors on the web. Social Netw 25(3):211–230. https://doi.org/10.1016/S0378-8733(03)00009-1
Zhou T, Lü L, Zhang Y-C (2009) Predicting missing links via local information. European Phys J B - Condensed Matter Complex Syst 71:623–630. https://doi.org/10.1140/epjb/e2009-00335-8
Jaccard P (1901) Distribution de la flore alpine dans le bassin des dranses et dans quelques régions voisines. Bulletin de la Societe Vaudoise des Sciences Naturelles 37:241–72. https://doi.org/10.5169/seals-266440
Katz L (1953) A new status index derived from sociometric analysis. Psychometrika 18(1):39–43. https://doi.org/10.1007/BF02289026
Liu W, Lü L (2010) Link prediction based on local random walk. Europhys Lett 89:58007. https://doi.org/10.1209/0295-5075/89/58007
Tong H, Faloutsos C, Pan J-y (2006) Fast random walk with restart and its applications. In: Sixth International Conference on Data Mining (ICDM’06), pp 613–622. https://doi.org/10.1109/ICDM.2006.70
Zhang X, Zhao C, Wang X, Yi D (2014) Identifying missing and spurious interactions in directed networks. In: Cai Z, Wang C, Cheng S, Wang H, Gao H (eds) Wireless Algorithms, Systems, and Applications. Springer, Cham, pp 470–481
Pan Y, Zou J, Qiu J, Wang S, Hu G, Pan Z (2022) Joint network embedding of network structure and node attributes via deep autoencoder. Neurocomputing 468:198–210. https://doi.org/10.1016/j.neucom.2021.10.032
Lü L, Pan L, Zhou T, Zhang Y-C, Stanley H (2015) Toward link predictability of complex networks. Proc National Academy Sci 112:201424644. https://doi.org/10.1073/pnas.1424644112
Wang W, Feng Y, Jiao P, Yu W (2017) Kernel framework based on non-negative matrix factorization for networks reconstruction and link prediction. Knowledge-Based Syst 137:104–114. https://doi.org/10.1016/j.knosys.2017.09.020
Ma X, Sun P, Wang Y (2018) Graph regularized nonnegative matrix factorization for temporal link prediction in dynamic networks. Phys A: Stat Mech Appl 496:121–136. https://doi.org/10.1016/j.physa.2017.12.092
Tosyali A, Kim J, Choi J, Jeong MK (2019) Regularized asymmetric nonnegative matrix factorization for clustering in directed networks. Pattern Recogn Lett 125:750–757. https://doi.org/10.1016/j.patrec.2019.07.005
Chen G, Xu C, Wang J, Feng J, Feng J (2020) Robust non-negative matrix factorization for link prediction in complex networks using manifold regularization and sparse learning. Phys A: Stat Mech Appl 539:122882. https://doi.org/10.1016/j.physa.2019.122882
Jiao P, Guo X, Jing X, He D, Wu H, Pan S, Gong M, Wang W (2021) Temporal network embedding for link prediction via VAE joint attention mechanism. IEEE Transactions on Neural Networks and Learning Systems, 1–14. https://doi.org/10.1109/TNNLS.2021.3084957
Fang Z, Tan S, Wang Y, Lu J (2021) Elementary subgraph features for link prediction with neural networks. IEEE Transactions on Knowledge and Data Engineering, 1–1. https://doi.org/10.1109/TKDE.2021.3132352
Ghorbanzadeh H, Sheikhahmadi A, Jalili M, Sulaimany S (2021) A hybrid method of link prediction in directed graphs. Expert Syst Appl 165:113896. https://doi.org/10.1016/j.eswa.2020.113896
Wang X-W, Chen Y, Liu Y-Y (2020) Link prediction through deep generative model. iScience 23(10):101626. https://doi.org/10.1016/j.isci.2020.101626
Lee DD, Seung HS (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401:788–791
Ye F, Chen C, Zheng Z (2018) Deep autoencoder-like nonnegative matrix factorization for community detection. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM ’18, 1393–1402. https://doi.org/10.1145/3269206.3271697
Chen W-S, Zeng Q, Pan B (2022) A survey of deep nonnegative matrix factorization. Neurocomputing 491:305–320. https://doi.org/10.1016/j.neucom.2021.08.152
Page L, Brin S, Motwani R, Winograd T (1998) The PageRank Citation Ranking: Bringing Order to the Web
Kong D, Ding C, Huang H (2011) Robust nonnegative matrix factorization using l21-norm. In: Proceedings of the 20th ACM International Conference on Information and Knowledge Management. CIKM ’11, 673–682. https://doi.org/10.1145/2063576.2063676
Hanley JA, McNeil B (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36. https://doi.org/10.1148/radiology.143.1.7063747
Herlocker J, Konstan J, Terveen L, Lui J.C.s, Riedl T (2004) Evaluating collaborative filtering recommender systems. ACM Trans Inform Syst 22:5–53. https://doi.org/10.1145/963770.963772
Rossi RA, Ahmed NK (2015) The network data repository with interactive graph analytics and visualization. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. https://networkrepository.com
Kunegis J (2013) KONECT: The Koblenz network collection. In: Proceedings of the 22nd International Conference on World Wide Web. WWW ’13 Companion, 1343–1350. https://doi.org/10.1145/2487788.2488173
Batagelj V, Mrvar A (2006) Pajek datasets. http://vlado.fmf.uni-lj.si/pub/networks/data/
Xie T, Zhang H, Liu R, Xiao H (2022) Accelerated sparse nonnegative matrix factorization for unsupervised feature learning. Pattern Recogn Lett 156:46–52. https://doi.org/10.1016/j.patrec.2022.01.020
Lee DD, Seung HS (2000) Algorithms for non-negative matrix factorization. In: Proceedings of the 13th International Conference on Neural Information Processing Systems. NIPS’00, 535–541. MIT Press, Cambridge, MA, USA
Acknowledgements
This research was supported by the National Natural Science Foundation of China (Grant No. 11571074) and the Natural Science Foundation of Fujian Province, China (Grant No. 2022J01102).
Ethics declarations
Conflicts of interest
This is the first submission of this manuscript and no parts of this manuscript are being considered for publication elsewhere. All authors have approved this manuscript. No author has financial or other contractual agreements that might cause conflicts of interest.
Appendix
1.1 Convergence proof of DM-MFAE-G
Here, we demonstrate the convergence of DM-MFAE-G, i.e., that \(\mathcal {L}\) is non-increasing under the update rules (27), (28) and (29). We first prove the following Theorems 1 to 3.
Theorem 1
Updating \(U_{\textrm{i}}\) using rule (27) while fixing \(W_{\textrm{i}}\) and \(H_{\textrm{i}}\), the objective function \(\mathcal {L} \) is monotonically non-increasing.
Theorem 2
Updating \(W_{\textrm{i}}\) using rule (28) while fixing \(U_{\textrm{i}}\) and \(H_{\textrm{i}}\), the objective function \(\mathcal {L} \) is monotonically non-increasing.
Theorem 3
Updating \(H_{\textrm{i}}\) using rule (29) while fixing \(U_{\textrm{i}}\) and \(W_{\textrm{i}}\), the objective function \(\mathcal {L} \) is monotonically non-increasing.
Lemma 1
Under the update rule (27), the following inequality holds
where
t is the iteration number.
Proof
To prove Lemma 1, we introduce the following definition and lemmas, following Lee and Seung [44].\(\square \)
Definition 1
A function \(G(h,h^{'})\) is an auxiliary function of the function L(h) if \(G\left( h,h^{\mathrm {'}} \right) \ge L\left( h \right) \) and \(G\left( h,h \right) =L\left( h \right) \) hold for any \(h,h^{\mathrm {'}}\).
Lemma 2
If \(G(h,h^{\mathrm {'}})\) is an auxiliary function for L(h), then L(h) is non-increasing under the update
$$h^{t+1}=\underset{h}{\arg \min }\; G\left( h,h^t \right) .$$
Proof
\(L\left( h^{t+1} \right) \le G\left( h^{t+1},h^t \right) \le G\left( h^t,h^t \right) =L\left( h^t \right) \).\(\square \)
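Lemma 2 is the argument used by Lee and Seung [44] for plain nonnegative matrix factorization. The following Python sketch, which does not implement the DM-MFAE-G rules (27) to (29), runs the classical multiplicative updates for \(\min \Vert X-WH\Vert _F^2\) and records the loss so that the non-increasing behaviour can be checked numerically. The function name `nmf_multiplicative`, the small constant `eps`, and the random initialization are illustrative assumptions.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=50, eps=1e-12, seed=0):
    """Classical Lee-Seung multiplicative updates for ||X - W H||_F^2.

    Illustrative stand-in for the auxiliary-function argument; this is
    plain NMF, not the DM-MFAE-G objective.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive start so that the multiplicative updates are well defined.
    W = rng.random((n, rank)) + 0.1
    H = rng.random((rank, m)) + 0.1
    losses = [np.linalg.norm(X - W @ H) ** 2]
    for _ in range(n_iter):
        # Each update minimizes an auxiliary function of the loss in one factor.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        losses.append(np.linalg.norm(X - W @ H) ** 2)
    return W, H, losses
```

Checking `losses[k + 1] <= losses[k]` (up to floating-point tolerance) for every `k` is a simple numerical counterpart of the chain of inequalities in the proof of Lemma 2.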
Next, we rewrite the objective function \(\mathcal {L} \), dropping the terms that do not involve \(U_i\), to obtain a function of \(U_i\) only.
Then we prove that (27) is exactly the update (34) under a suitable auxiliary function. Consider an element \(\left( U_i \right) _{kl}\) of \(U_i\), and let \(\mathcal {L} _{\left( U_i \right) _{kl}}\) denote the part of the objective function that depends only on \(\left( U_i \right) _{kl}\). Its derivative with respect to \(\left( U_i \right) _{kl}\) is obtained as:
where \(A_1=\psi _{i-1}^{T}\psi _{i-1}\), \(A_2=W_iW_{i}^{T}\), \(A_3=H_iH_{i}^{T}\), \(A_4=\psi _{i-1}^{T}XX^T\psi _{i-1}\), \(A_5=\psi _{i-1}^{T}CC^T\psi _{i-1}\). Therefore, each \(\left( U_i \right) _{kl}\) is updated according to rule (34).
Lemma 3
Let \(\mathcal {L} ^{\mathrm {'}}\) denote the first derivative with respect to the variable \(U_i\). Then Eq. (38) is an auxiliary function of \(\mathcal {L} _{\left( U_i \right) _{kl}}\).
Proof
Clearly, \(G\left( U_i,U_i \right) =\mathcal {L} _{\left( U_i \right) _{kl}}\left( U_i \right) \).
We will prove that \(G\left( U_i,\left( U_i \right) _{kl}^{t} \right) \ge \mathcal {L} _{\left( U_i \right) _{kl}}\left( U_i \right) \).
First, the Taylor expansion of \(\mathcal {L} _{\left( U_i \right) _{kl}}\left( U_i \right) \) is as follows:
Then, comparing (38) and (39), we find that \(G\left( U_i,\left( U_i \right) _{kl}^{t} \right) \ge \mathcal {L} _{\left( U_i \right) _{kl}}\left( U_i \right) \) holds provided the following inequality is satisfied:
Comparing both sides of the inequality, it is clear that (41)–(44) hold:
Thus, inequality (40) holds, and \(G\left( U_i,\left( U_i \right) _{kl}^{t} \right) \) is an auxiliary function of \(\mathcal {L} _{\left( U_i \right) _{kl}}\).\(\square \)
Based on the properties of auxiliary functions, Lemma 1 can be proved.
To find the global minimum of \(G\left( U_i,\left( U_i \right) _{kl}^{t} \right) \) with \(\left( U_i \right) _{kl}^{t}\) fixed, we take the derivative of \(G\left( U_i,\left( U_i \right) _{kl}^{t} \right) \) with respect to \(U_{\textrm{i}}\); then, according to the KKT conditions, we get:
According to (36), we have
Therefore, the iteration rule obtained from the auxiliary function (38) under the update (34) is exactly the iteration (27) for \(U_i\). Since the Hessian matrix \(\frac{\partial ^2G\left( U_i,\left( U_i \right) _{kl}^{t} \right) }{\partial \left( U_i \right) _{kl}\partial \left( U_i \right) _{kl}}\) is positive definite, \(G\left( U_i,\left( U_i \right) _{kl}^{t} \right) \) is a convex function, so this stationary point is its global minimum. Thus, the objective function \(\mathcal {L} \) is non-increasing under the update rule obtained from (38), i.e., rule (27), and Theorem 1 is proved.
Due to limited space, the proofs of Theorems 2 and 3 are omitted. Since the loss function satisfies \(\mathcal {L}>0\), it is bounded below; a sequence of loss values that is non-increasing and bounded below converges. Therefore, the DM-MFAE-G model converges.
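For orientation, the auxiliary-function construction can be written out in the plain NMF setting of Lee and Seung [44], whose technique this appendix follows. The block below shows that standard case only; the symbols \(F\), \(v\), \(W\), \(h\) and \(K\) are notation introduced here for illustration, and the block is not a reconstruction of Eqs. (38) to (46).

```latex
% Standard Lee-Seung case (assumed notation, not the paper's Eqs. (38)-(46)).
\begin{align*}
F(h) &= \tfrac{1}{2}\,\lVert v - W h \rVert_2^2, \qquad
\nabla F(h^t) = W^{T} W h^t - W^{T} v, \\
G(h, h^t) &= F(h^t) + (h - h^t)^{T} \nabla F(h^t)
  + \tfrac{1}{2}\,(h - h^t)^{T} K(h^t)\,(h - h^t), \\
K_{ab}(h^t) &= \delta_{ab}\,\frac{(W^{T} W h^t)_a}{h^t_a}, \\
h^{t+1} &= \arg\min_h G(h, h^t) = h^t - K(h^t)^{-1}\,\nabla F(h^t), \\
h^{t+1}_a &= h^t_a\,\frac{(W^{T} v)_a}{(W^{T} W h^t)_a}.
\end{align*}
```

Because \(F\) is quadratic, its Taylor expansion around \(h^t\) is exact with Hessian \(W^{T}W\), so \(G\ge F\) reduces to \(K(h^t)-W^{T}W\) being positive semidefinite, which Lee and Seung establish. Inequality (40) plays the analogous role in the proof above.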
1.2 More details about the experiments
We report the deep factorized feature dimension of each layer \(R=[r_1,r_2,...,r_p]\) in Table 7.
In link prediction experiments, the dataset must first be randomly divided into a training set and a test set, so the optimal model parameters vary from one division to another. As a reference, Table 8 therefore gives the approximate range of the optimal parameters \(\alpha \), \(\beta \), \(\gamma \) observed for each dataset in the missing link prediction experiments.
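The random division mentioned above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `split_edges` and the default 10% test fraction are hypothetical, and every nonzero entry of the adjacency matrix is treated as a separate directed link (for an undirected network one would split the upper triangle and mirror it instead).

```python
import numpy as np

def split_edges(A, test_fraction=0.1, seed=0):
    """Randomly hide a fraction of the observed links of A as a test set.

    Illustrative only: the split ratio and procedure used in the paper's
    experiments are not stated in this section.
    """
    A = np.asarray(A)
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(A)
    order = rng.permutation(len(rows))
    n_test = int(round(test_fraction * len(rows)))
    test_rows, test_cols = rows[order[:n_test]], cols[order[:n_test]]
    # Training matrix: observed links with the test links removed.
    A_train = A.copy()
    A_train[test_rows, test_cols] = 0
    # Test matrix: only the hidden links.
    A_test = np.zeros_like(A)
    A_test[test_rows, test_cols] = A[test_rows, test_cols]
    return A_train, A_test
```

Changing `seed` changes which links are hidden, which is why tuned values of \(\alpha \), \(\beta \), \(\gamma \) can differ between runs.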
About this article
Cite this article
Lin, X., Chen, X. & Zheng, Z. Deep manifold matrix factorization autoencoder using global connectivity for link prediction. Appl Intell 53, 25816–25835 (2023). https://doi.org/10.1007/s10489-023-04887-9
DOI: https://doi.org/10.1007/s10489-023-04887-9