Generalization Error Bounds Using Unlabeled Data

  • Conference paper
Learning Theory (COLT 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3559)

Abstract

We present two new methods for obtaining generalization error bounds in a semi-supervised setting. Both methods are based on approximating the disagreement probability of pairs of classifiers using unlabeled data. The first method works in the realizable case. It suggests how the ERM principle can be refined using unlabeled data and has provable optimality guarantees when the number of unlabeled examples is large. Furthermore, the technique extends easily to cover active learning. A downside is that the method is of little use in practice due to its limitation to the realizable case.
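
To make the first idea concrete, the following is a minimal Python sketch of the kind of selection rule it suggests: among classifiers consistent with the labeled sample, choose the one whose worst-case disagreement with the other consistent classifiers, estimated on unlabeled data, is smallest. The finite enumeration of the version space, the function names, and the plain empirical estimate without confidence terms are illustrative assumptions for this sketch, not the construction used in the paper.

```python
import numpy as np

def disagreement(f, g, X_unlabeled):
    """Empirical probability that classifiers f and g disagree on the unlabeled sample."""
    return float(np.mean(f(X_unlabeled) != g(X_unlabeled)))

def refined_erm(consistent_classifiers, X_unlabeled):
    """Among classifiers consistent with the labeled data (realizable case),
    pick the one minimizing its worst-case estimated disagreement with the rest.

    In the realizable case the target concept is itself consistent, so the
    selected classifier's error is at most this worst-case disagreement, up to
    the estimation error of the unlabeled sample (confidence terms omitted here).
    """
    radii = [
        max(disagreement(f, g, X_unlabeled) for g in consistent_classifiers)
        for f in consistent_classifiers
    ]
    best = int(np.argmin(radii))
    return consistent_classifiers[best], radii[best]
```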

The idea in our second method is to use unlabeled data to transform bounds for randomized classifiers into bounds for simpler deterministic classifiers. As a concrete example of how the general method works in practice, we apply it to a bound based on cross-validation. The result is a semi-supervised bound for classifiers learned from all the labeled data. The bound is easy to implement and apply, and should be tight whenever cross-validation makes sense. Applying the bound to SVMs on the MNIST benchmark data set gives results suggesting that the bound may be tight enough to be useful in practice.
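
As a rough illustration of the second idea, the sketch below combines a cross-validation error estimate for the randomized "pick one fold classifier uniformly at random" predictor with the disagreement, measured on unlabeled data, between that randomized predictor and the single classifier trained on all labeled data; by the triangle inequality for the disagreement pseudometric, the sum bounds the error of the deterministic classifier. The Hoeffding-style confidence terms, the splitting of the confidence parameter, and the function names are assumptions made for the example; the paper's actual bound uses its own deviation terms.

```python
import numpy as np

def hoeffding_term(n, delta):
    """Two-sided Hoeffding deviation for the mean of n Bernoulli variables."""
    return np.sqrt(np.log(2.0 / delta) / (2.0 * n))

def semi_supervised_bound(full_clf, fold_clfs, cv_error, n_labeled, X_unlabeled, delta=0.05):
    """Sketch: bound for the randomized fold-classifier predictor, plus its
    estimated disagreement (on unlabeled data) with the classifier trained on
    all labeled data."""
    # Error bound for the randomized classifier, from the average held-out CV error.
    randomized_bound = cv_error + hoeffding_term(n_labeled, delta / 3.0)

    # Disagreement between the full-data classifier and a uniformly random
    # fold classifier, estimated on the unlabeled sample.
    preds_full = full_clf(X_unlabeled)
    disagree = float(np.mean([np.mean(f(X_unlabeled) != preds_full) for f in fold_clfs]))
    disagree_bound = disagree + hoeffding_term(len(X_unlabeled), delta / 3.0)

    # Error of the full-data classifier is at most the sum of the two quantities.
    return randomized_bound + disagree_bound
```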

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kääriäinen, M. (2005). Generalization Error Bounds Using Unlabeled Data. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science, vol. 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_9

  • DOI: https://doi.org/10.1007/11503415_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26556-6

  • Online ISBN: 978-3-540-31892-7
