Abstract
We present two new methods for obtaining generalization error bounds in a semi-supervised setting. Both methods are based on approximating the disagreement probability of pairs of classifiers using unlabeled data. The first method works in the realizable case. It suggests how the ERM principle can be refined using unlabeled data and has provable optimality guarantees when the number of unlabeled examples is large. Furthermore, the technique extends easily to cover active learning. A downside is that its restriction to the realizable case makes the method of little practical use.
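The flavor of the first method can be illustrated with a small sketch. This is only a hypothetical rendering of the idea as the abstract describes it, not the paper's actual algorithm: among hypotheses consistent with the labeled sample (the version space), unlabeled data is used to estimate pairwise disagreement probabilities, and the hypothesis with the smallest worst-case disagreement to the rest is selected. In the realizable case the target concept lies in the version space, so the selected hypothesis has true error at most this minimax disagreement radius, up to estimation error.

```python
def disagreement(f, g, unlabeled):
    """Empirical probability that classifiers f and g disagree on unlabeled data."""
    return sum(f(x) != g(x) for x in unlabeled) / len(unlabeled)

def select_center(version_space, unlabeled):
    """Pick the hypothesis whose worst-case disagreement with the other
    consistent hypotheses is smallest: a minimax 'center' of the version
    space, measured entirely with unlabeled data."""
    radius = {
        f: max((disagreement(f, g, unlabeled) for g in version_space if g is not f),
               default=0.0)
        for f in version_space
    }
    return min(radius, key=radius.get)
```

For example, with threshold classifiers `x >= t` for thresholds 0.2, 0.4, and 0.6 all consistent with the labeled data, the middle threshold minimizes the worst-case disagreement on a uniform unlabeled sample, so it is the hypothesis this refinement of ERM would select.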
The idea in our second method is to use unlabeled data to transform bounds for randomized classifiers into bounds for simpler deterministic classifiers. As a concrete example of how the general method works in practice, we apply it to a bound based on cross-validation. The result is a semi-supervised bound for classifiers learned from all the labeled data. The bound is easy to implement and apply, and it should be tight whenever cross-validation makes sense. Applying it to SVMs on the MNIST benchmark data set yields results suggesting that the bound may be tight enough to be useful in practice.
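A minimal sketch of this transformation, under stated assumptions (the names `cv_error` and `fold_classifiers` are illustrative, and the confidence bookkeeping below is only the obvious Hoeffding-style version, not necessarily the paper's): a valid error bound for the randomized classifier that applies a uniformly drawn cross-validation fold classifier transfers to the single deterministic classifier trained on all labeled data by adding their disagreement probability, which can be estimated from unlabeled data alone.

```python
import math

def hoeffding_slack(n, delta):
    """Deviation term: with probability >= 1 - delta, an empirical mean of n
    Bernoulli variables underestimates its expectation by at most this."""
    return math.sqrt(math.log(1 / delta) / (2 * n))

def derandomized_bound(deterministic, fold_classifiers, cv_error, unlabeled, delta):
    """Transfer a bound for a randomized classifier to a deterministic one.
    By the triangle inequality,
        err(deterministic) <= err(randomized) + P(deterministic != randomized).
    Here cv_error is assumed to upper-bound the randomized classifier's error
    with probability >= 1 - delta / 2; the disagreement probability is
    estimated on unlabeled data and padded with its own Hoeffding slack."""
    n = len(unlabeled)
    # Empirical probability that the deterministic classifier disagrees with
    # a uniformly drawn fold classifier, over the unlabeled sample.
    dis = sum(
        sum(deterministic(x) != g(x) for g in fold_classifiers)
        for x in unlabeled
    ) / (n * len(fold_classifiers))
    return cv_error + dis + hoeffding_slack(n, delta / 2)
```

The appeal is that the extra term is driven by disagreement, not by labeled error: if the full-data classifier mostly agrees with the fold classifiers on unlabeled data, the deterministic bound inherits the cross-validation bound almost unchanged.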
© 2005 Springer-Verlag Berlin Heidelberg
Kääriäinen, M. (2005). Generalization Error Bounds Using Unlabeled Data. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science(), vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26556-6
Online ISBN: 978-3-540-31892-7