Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7070)

Abstract

In the minimum description length (MDL) and Bayesian criteria, we construct a description length for data \(z^n = z_1 \cdots z_n\) of length \(n\) such that the length divided by \(n\) converges almost surely to the entropy rate as \(n \to \infty\), assuming each \(z_i\) lies in a finite set \(A\). In model selection, if we knew the true probability \(P\) of \(z^n \in A^n\), we would choose the model \(F\) that maximizes the posterior probability of \(F\) given \(z^n\). In many situations, however, only the data \(z^n\) are available, so instead of \(P\) we use a measure \(Q: A^n \to [0,1]\) such that \(\sum_{z^n\in A^n}Q(z^n)\leq 1\). In this paper, we consider an extension in which each attribute of the data may be either discrete or continuous. The main issue is which \(Q\) qualifies as an alternative to \(P\) in this generalized situation. We propose a condition in terms of the Radon-Nikodym derivative of \(P\) with respect to \(Q\), and give a procedure for constructing \(Q\) in the general setting. As a result, we obtain MDL/Bayesian criteria in a general sense.
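
The construction of such a \(Q\) is left implicit in the abstract. For the purely finite-alphabet case, one standard universal measure with \(\sum_{z^n \in A^n} Q(z^n) \leq 1\) is the Krichevsky-Trofimov (KT) estimator, whose per-symbol code length \(-\frac{1}{n}\log_2 Q(z^n)\) converges to the entropy rate for i.i.d. sources. The Python sketch below illustrates this finite-alphabet baseline only, not the paper's mixed discrete/continuous construction; the function name and the Bernoulli test source are our own choices, for illustration.

```python
import math
import random
from collections import Counter

def kt_codelength(seq, alphabet_size):
    """Code length -log2 Q(z^n) in bits under the Krichevsky-Trofimov
    sequential estimator: at step t, symbol a is predicted with
    probability (n_t(a) + 1/2) / (t + |A|/2), where n_t(a) counts
    occurrences of a among the first t symbols."""
    counts = Counter()
    bits = 0.0
    for t, symbol in enumerate(seq):  # t = 0, 1, ..., n-1
        p = (counts[symbol] + 0.5) / (t + alphabet_size / 2.0)
        bits -= math.log2(p)
        counts[symbol] += 1
    return bits

if __name__ == "__main__":
    # Biased i.i.d. coin with P(1) = 0.2; entropy rate H(0.2) is about 0.72 bits.
    random.seed(0)
    n = 10000
    z = [1 if random.random() < 0.2 else 0 for _ in range(n)]
    print(kt_codelength(z, 2) / n)  # roughly 0.72-0.73 bits per symbol
```

Since \(Q\) is a (sub)probability measure fixed in advance, \(-\log Q(z^n)\) is a valid description length even though the true \(P\) is unknown; this is the sense in which \(Q\) substitutes for \(P\) in the MDL/Bayesian criteria.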

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Suzuki, J. (2013). MDL/Bayesian Criteria Based on Universal Coding/Measure. In: Dowe, D.L. (ed.) Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence. Lecture Notes in Computer Science, vol. 7070. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44958-1_31

  • DOI: https://doi.org/10.1007/978-3-642-44958-1_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-44957-4

  • Online ISBN: 978-3-642-44958-1
