A PAC-Bayes Bound for Tailored Density Estimation

  • Conference paper
Algorithmic Learning Theory (ALT 2010)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6331)

Abstract

In this paper we construct a general method for reporting on the accuracy of density estimation. Using variational methods from statistical learning theory we derive a PAC, algorithm-dependent bound on the distance between the data generating distribution and a learned approximation. The distance measure takes the role of a loss function that can be tailored to the learning problem, enabling us to control discrepancies on tasks relevant to subsequent inference. We apply the bound to an efficient mixture learning algorithm. Using the method of localisation we encode properties of both the algorithm and the data generating distribution, producing a tight, empirical, algorithm-dependent upper risk bound on the performance of the learner. We discuss other uses of the bound for arbitrary distributions and model averaging.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Higgs, M., Shawe-Taylor, J. (2010). A PAC-Bayes Bound for Tailored Density Estimation. In: Hutter, M., Stephan, F., Vovk, V., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2010. Lecture Notes in Computer Science, vol 6331. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16108-7_15

  • DOI: https://doi.org/10.1007/978-3-642-16108-7_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-16107-0

  • Online ISBN: 978-3-642-16108-7

  • eBook Packages: Computer Science, Computer Science (R0)
