An Information Theoretic Optimal Classifier for Semi-supervised Learning

Yin, Ke; Davidson, Ian

doi:10.1007/978-3-540-28651-6_110

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3177)

Abstract

Model uncertainty refers to the risk associated with basing predictions on a single model. In semi-supervised learning, this uncertainty is greater than in supervised learning (for the same total number of instances) because many data points are unlabelled. An optimal Bayes classifier (OBC) reduces model uncertainty by averaging predictions across the entire model space, weighted by the models’ posterior probabilities. For a given model space and prior distribution, the OBC produces the lowest risk. We propose an information theoretic method to construct an OBC for probabilistic semi-supervised learning using Markov chain Monte Carlo sampling. This contrasts with typical semi-supervised learning, which attempts to find the single most probable model using EM. Empirical results verify that the OBC yields more accurate predictions than the best single model.
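
The averaging step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a sampler (for example, an MCMC run) has already produced several models from the posterior, and that each model has been reduced to class probabilities for one test point. The function name `average_predictions` and all numbers are hypothetical, chosen only to show how the averaged prediction can differ from the prediction of one model.

```python
def average_predictions(sample_probs):
    """Average class probabilities over models sampled from the posterior.

    sample_probs: list of rows; row t holds P(y | x, model_t) for one test
    point x under the t-th sampled model. Samples drawn from the posterior
    are weighted equally, so a plain mean approximates the
    posterior-weighted average.
    """
    n_samples = len(sample_probs)
    n_classes = len(sample_probs[0])
    return [
        sum(row[c] for row in sample_probs) / n_samples
        for c in range(n_classes)
    ]


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical class probabilities for one test point under three sampled
# models. The first row stands in for a single selected model (assumption).
sample_probs = [
    [0.55, 0.45],
    [0.20, 0.80],
    [0.30, 0.70],
]

averaged = average_predictions(sample_probs)
averaged_label = argmax(averaged)             # label from the averaged prediction
single_model_label = argmax(sample_probs[0])  # label from the first model alone
```

In this toy case the first model alone favours class 0, while the average over all three sampled models favours class 1, which is the kind of disagreement that model averaging is meant to resolve.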

Author information

Authors and Affiliations

Department of Computer Science, University at Albany, 1400 Washington Ave, Albany, NY, 12222, USA
Ke Yin & Ian Davidson

Editor information

Editors and Affiliations

School of Engineering, Computing, and Mathematics, University of Exeter, EX4 4QF, Exeter, UK
Zheng Rong Yang
School of Electrical and Electronic Engineering, University of Manchester, UK
Hujun Yin
School of Engineering, Computer Science and Mathematics, University of Exeter, EX4 4QF, UK
Richard M. Everson

About this paper

Cite this paper

Yin, K., Davidson, I. (2004). An Information Theoretic Optimal Classifier for Semi-supervised Learning. In: Yang, Z.R., Yin, H., Everson, R.M. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2004. Lecture Notes in Computer Science, vol 3177. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28651-6_110

DOI: https://doi.org/10.1007/978-3-540-28651-6_110
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22881-3
Online ISBN: 978-3-540-28651-6
eBook Packages: Springer Book Archive
