ABSTRACT
Semi-supervised learning (SSL) is classification in which additional unlabeled data can be used to improve accuracy. Generative approaches are appealing in this situation, as a model of the data's probability density can assist in identifying clusters. Nonparametric Bayesian methods, while ideal in theory due to their principled motivations, have been difficult to apply to SSL in practice. We present a nonparametric Bayesian method that uses Gaussian processes for the generative model, avoiding many of the problems associated with Dirichlet process mixture models. Our model is fully generative and we take advantage of recent advances in Markov chain Monte Carlo algorithms to provide a practical inference method. Our method compares favorably to competing approaches on synthetic and real-world multi-class data.
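To make the setting concrete, here is a minimal sketch of the semi-supervised problem the abstract describes: a handful of labels plus many unlabeled points, where exploiting the unlabeled cluster structure can improve accuracy. This is not the paper's Archipelago model; it only illustrates the setting, using scikit-learn's off-the-shelf GaussianProcessClassifier and LabelSpreading as assumed stand-ins on a toy two-cluster dataset.

```python
# Illustrative sketch of the SSL setting only -- NOT the paper's Archipelago
# model.  A supervised GP classifier sees 10 labels; a semi-supervised
# graph-based method additionally uses the unlabeled points.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.semi_supervised import LabelSpreading

rng = np.random.RandomState(0)
X, y = make_moons(n_samples=400, noise=0.1, random_state=0)

# Reveal only 10 labels; mark the remaining points as unlabeled (-1).
labeled = rng.choice(len(X), size=10, replace=False)
y_partial = np.full(len(X), -1)
y_partial[labeled] = y[labeled]

# Purely supervised baseline: GP classifier trained on the labeled points only.
gpc = GaussianProcessClassifier().fit(X[labeled], y[labeled])

# Semi-supervised alternative: label spreading uses all points' geometry.
ssl = LabelSpreading(kernel="rbf", gamma=20).fit(X, y_partial)

print("supervised GP accuracy:   %.3f" % gpc.score(X, y))
print("semi-supervised accuracy: %.3f" % ssl.score(X, y))
```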