Abstract
The task of semi-supervised classification aims at assigning labels to all nodes of a graph based on the labels known for a few nodes, called the seeds. One of the most popular algorithms relies on the principle of heat diffusion, where the labels of the seeds are spread by thermo-conductance and the temperature of each node at equilibrium is used as a score function for each label. In this paper, we prove that this algorithm is not consistent unless the temperatures of the nodes at equilibrium are centered before scoring. This crucial step not only makes the algorithm provably consistent on a block model but also brings significant performance gains on real graphs.
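As a minimal sketch of the procedure described above, assuming a dense symmetric adjacency matrix and at least one seed in every connected component, the following Python code solves one Dirichlet problem per label and centers the equilibrium temperatures before taking the arg max. The function name `diffusion_classify`, its parameters and the toy graph are illustrative assumptions, not the authors' implementation; centering by the mean temperature over all nodes, seeds included, is likewise an assumption.

```python
import numpy as np


def diffusion_classify(adjacency, seeds, n_labels, centering=True):
    """Label all nodes by heat diffusion (hypothetical helper).

    adjacency: dense symmetric (n, n) array of non-negative weights.
    seeds: dict mapping node index -> known label in {0, ..., n_labels - 1}.
    Assumes every connected component contains at least one seed.
    """
    n = adjacency.shape[0]
    seed_nodes = np.array(sorted(seeds))
    seed_labels = np.array([seeds[i] for i in seed_nodes])
    free = np.setdiff1d(np.arange(n), seed_nodes)
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency

    # Column l is the Dirichlet problem for label l:
    # seeds with label l are held at 1, all other seeds at 0.
    temperatures = np.zeros((n, n_labels))
    temperatures[seed_nodes, seed_labels] = 1.0

    # At equilibrium each free node is at the weighted mean of its neighbors,
    # i.e. L_ff T_f = -L_fs T_s.
    l_ff = laplacian[np.ix_(free, free)]
    l_fs = laplacian[np.ix_(free, seed_nodes)]
    temperatures[free] = np.linalg.solve(l_ff, -l_fs @ temperatures[seed_nodes])

    # Centering step (assumption: mean over all nodes, one mean per label).
    scores = temperatures - temperatures.mean(axis=0) if centering else temperatures
    labels = scores.argmax(axis=1)
    labels[seed_nodes] = seed_labels  # seeds keep their known labels
    return labels, temperatures


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
adjacency = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    adjacency[i, j] = adjacency[j, i] = 1.0

labels, temperatures = diffusion_classify(adjacency, {0: 0, 5: 1}, n_labels=2)
```

On this symmetric toy graph the seeds are balanced, so centering does not change the prediction; each triangle simply inherits the label of its seed.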
Notes
- 1. The number of citations of the paper [14] exceeds 4 000 in 2023, according to Google Scholar.
References
Airoldi, E.M., Blei, D.M., Fienberg, S.E., Xing, E.P.: Mixed membership stochastic blockmodels. J. Mach. Learn. Res. (2008)
Berberidis, D., Nikolakopoulos, A.N., Giannakis, G.B.: AdaDIF: Adaptive diffusions for efficient semi-supervised learning over graphs. In: International Conference on Big Data. IEEE (2018)
Chung, F.R.: Spectral graph theory. American Mathematical Soc. (1997)
Donnat, C., Zitnik, M., Hallac, D., Leskovec, J.: Learning structural node embeddings via diffusion wavelets. In: International Conference on Knowledge Discovery & Data Mining. ACM (2018)
Kondor, R.I., Lafferty, J.: Diffusion kernels on graphs and other discrete structures. In: Proceedings of the 19th International Conference on Machine Learning (2002)
Li, Q., An, S., Li, L., Liu, W.: Semi-supervised learning on graph with an alternating diffusion process. CoRR (2019)
Ma, H., King, I., Lyu, M.R.: Mining web graphs for recommendations. IEEE Transactions on Knowledge and Data Engineering (2011)
Newman, M.E.J., Girvan, M.: Mixing patterns and community structure in networks. In: Pastor-Satorras, R., Rubi, M., Diaz-Guilera, A. (eds.) Statistical Mechanics of Complex Networks, pp. 66–87. Springer Berlin Heidelberg, Berlin, Heidelberg (2003). https://doi.org/10.1007/978-3-540-44943-0_5
Rossi, E., Kenlay, H., Gorinova, M.I., Chamberlain, B.P., Dong, X., Bronstein, M.M.: On the unreasonable effectiveness of feature propagation in learning on graphs with missing node features. In: Proceedings of Machine Learning Research (2022)
Thanou, D., Dong, X., Kressner, D., Frossard, P.: Learning heat diffusion graphs. IEEE Transactions on Signal and Information Processing over Networks (2017)
Tremblay, N., Borgnat, P.: Graph wavelets for multiscale community mining. IEEE Transactions on Signal Processing (2014)
Zachary, W.W.: An information flow model for conflict and fission in small groups. J. Anthropol. Res. (1977)
Zhu, X.: Semi-supervised learning with graphs. Ph.D. thesis, Carnegie Mellon University (2005)
Zhu, X., Ghahramani, Z., Lafferty, J.D.: Semi-supervised learning using Gaussian fields and harmonic functions. In: Proceedings of the 20th International Conference on Machine Learning (2003)
Appendix
A Proof of Lemma 1
Proof
In view of (2), we have:
for \( k=2,\ldots ,K\). We deduce:
with
The proof then follows from the fact that
B Proof of Theorem 1
Proof
Let \(\varDelta ^{(1)}_k = T_k - \bar{T}\) be the deviation of temperature of non-seed nodes of block k for the Dirichlet problem associated with label 1. In view of Lemma 1, we have:
For \(p>q\), using the fact that \(\bar{T} \in (0,1)\), we get \(\varDelta ^{(1)}_1 > 0\) and \(\varDelta ^{(1)}_k<0\) for all \(k=2,\ldots ,K\). By symmetry, for each label \(l = 1,\ldots ,K\), \(\varDelta ^{(l)}_l > 0\) and \(\varDelta ^{(l)}_k<0\) for all \(k\ne l\). We deduce that for each block k, \(\hat{y}_i=\arg \max _{l}\varDelta ^{(l)}_k = k\) for each free node i of block k.
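The sign argument above can be checked numerically. The sketch below is only an illustration under stated assumptions: the block model is represented by a deterministic weighted graph with weight \(p\) between nodes of the same block and \(q\) between nodes of different blocks, \(\bar{T}\) is taken as the mean temperature over all nodes, and the values \(p=0.3\), \(q=0.1\), the block sizes and the seed counts are hypothetical choices. Labels and blocks are indexed from 0 in the code.

```python
import numpy as np


def equilibrium_temperatures(weights, seed_nodes, seed_labels, n_labels):
    """Solve one Dirichlet problem per label (hypothetical helper).

    Column l of the result holds the equilibrium temperatures when seeds
    with label l are held at 1 and all other seeds at 0.
    """
    n = weights.shape[0]
    free = np.setdiff1d(np.arange(n), seed_nodes)
    laplacian = np.diag(weights.sum(axis=1)) - weights
    temperatures = np.zeros((n, n_labels))
    temperatures[seed_nodes, seed_labels] = 1.0
    rhs = -laplacian[np.ix_(free, seed_nodes)] @ temperatures[seed_nodes]
    temperatures[free] = np.linalg.solve(laplacian[np.ix_(free, free)], rhs)
    return temperatures, free


# Assumed block model: 2 blocks of 10 nodes, weight p inside a block, q across.
p, q, block_size = 0.3, 0.1, 10
blocks = np.repeat([0, 1], block_size)
weights = np.where(blocks[:, None] == blocks[None, :], p, q)
np.fill_diagonal(weights, 0.0)  # no self-loops

# Unbalanced seeds: 1 seed in block 0, 5 seeds in block 1.
seed_nodes = np.array([0, 10, 11, 12, 13, 14])
seed_labels = blocks[seed_nodes]

temperatures, free = equilibrium_temperatures(weights, seed_nodes, seed_labels, 2)

# Deviations from the mean temperature, one mean per label.
deviations = temperatures - temperatures.mean(axis=0)

raw_accuracy = (temperatures[free].argmax(axis=1) == blocks[free]).mean()
centered_accuracy = (deviations[free].argmax(axis=1) == blocks[free]).mean()
print(f"accuracy without centering: {raw_accuracy:.3f}")
print(f"accuracy with centering:    {centered_accuracy:.3f}")
```

In this configuration the uncentered temperatures assign every free node to the heavily seeded block, whereas the deviations have the signs stated in the proof, so every free node recovers its own block.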
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Bonald, T., De Lara, N. (2024). A Consistent Diffusion-Based Algorithm for Semi-Supervised Graph Learning. In: Cherifi, H., Rocha, L.M., Cherifi, C., Donduran, M. (eds) Complex Networks & Their Applications XII. COMPLEX NETWORKS 2023. Studies in Computational Intelligence, vol 1141. Springer, Cham. https://doi.org/10.1007/978-3-031-53468-3_23
DOI: https://doi.org/10.1007/978-3-031-53468-3_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53467-6
Online ISBN: 978-3-031-53468-3
eBook Packages: Engineering, Engineering (R0)