Abstract
We consider the problem of labeling a partially labeled graph. This setting arises in a number of situations, from survey sampling to information retrieval to pattern recognition in manifold settings. It is also of potential practical importance when data is abundant but labeling is expensive or requires human assistance.
Our approach develops a framework for regularization on such graphs. The algorithms are very simple and involve solving a single, usually sparse, system of linear equations. Using the notion of algorithmic stability, we derive bounds on the generalization error and relate it to structural invariants of the graph. Some experimental results testing the performance of the regularization algorithm and the usefulness of the generalization bound are presented.
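As an illustration of how labeling a graph can reduce to a single sparse linear system, the following is a minimal sketch of one common form of Laplacian regularization. It is not necessarily the exact objective used in the paper. It assumes an unweighted, connected graph given as adjacency lists and minimizes the squared error on the labeled vertices plus gamma times f^T L f, where L = D - W is the unnormalized graph Laplacian. The names adj, labels, gamma, laplacian_matvec, and label_graph are hypothetical, and the conjugate gradient solver is chosen here only because it needs nothing beyond sparse matrix-vector products.

```python
def laplacian_matvec(adj, f):
    """Return L f for the unnormalized Laplacian L = D - W (unit edge weights)."""
    return [len(adj[i]) * f[i] - sum(f[j] for j in adj[i]) for i in range(len(adj))]


def label_graph(adj, labels, gamma=0.1, tol=1e-20, max_iter=1000):
    """Solve (J + gamma * L) f = J y by conjugate gradient.

    J is the diagonal 0/1 indicator of labeled vertices and y holds the known
    labels (assumed +1 / -1 here). The system is the stationarity condition of
        sum_{i labeled} (f_i - y_i)^2 + gamma * f^T L f.
    For a connected graph with at least one labeled vertex and gamma > 0, the
    matrix is symmetric positive definite, so conjugate gradient applies.
    """
    n = len(adj)
    b = [float(labels.get(i, 0.0)) for i in range(n)]  # right-hand side J y

    def apply_matrix(f):
        lf = laplacian_matvec(adj, f)
        return [(f[i] if i in labels else 0.0) + gamma * lf[i] for i in range(n)]

    f = [0.0] * n
    r = b[:]          # residual b - A f for the starting point f = 0
    p = r[:]          # initial search direction
    rs = sum(x * x for x in r)
    for _ in range(max_iter):
        if rs < tol:  # squared residual norm is small enough
            break
        ap = apply_matrix(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, ap))
        f = [fi + alpha * pi for fi, pi in zip(f, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(x * x for x in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return f


# Toy example (made up for illustration): a path graph 0-1-2-3-4-5 with the
# two end vertices labeled +1 and -1.
adj = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
labels = {0: 1.0, 5: -1.0}
f = label_graph(adj, labels, gamma=0.1)
predicted = [1 if value > 0 else -1 for value in f]
```

In this toy example the unlabeled vertices take the sign of the nearer labeled end, since the solution interpolates linearly along the path. On a large graph the same structure would typically be handed to a sparse linear algebra library rather than a hand-written solver.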
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Belkin, M., Matveeva, I., Niyogi, P. (2004). Regularization and Semi-supervised Learning on Large Graphs. In: Shawe-Taylor, J., Singer, Y. (eds) Learning Theory. COLT 2004. Lecture Notes in Computer Science, vol 3120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27819-1_43
DOI: https://doi.org/10.1007/978-3-540-27819-1_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22282-8
Online ISBN: 978-3-540-27819-1
eBook Packages: Springer Book Archive