Strong Entropy Concentration, Game Theory, and Algorithmic Randomness

  • Conference paper
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2111)
  • Included in the conference series: Computational Learning Theory (COLT 2001)

Abstract

We give a characterization of Maximum Entropy/Minimum Relative Entropy inference by providing two ‘strong entropy concentration’ theorems. These theorems unify and generalize Jaynes’ ‘concentration phenomenon’ and Van Campenhout and Cover’s ‘conditional limit theorem’. The theorems characterize exactly in what sense a ‘prior’ distribution Q conditioned on a given constraint and the distribution Ṗ minimizing D(P∥Q) over all P satisfying the constraint are ‘close’ to each other. We show how our theorems are related to ‘universal models’ for exponential families, thereby establishing a link with Rissanen’s MDL/stochastic complexity. We then apply our theorems to establish the relationship (A) between entropy concentration and a game-theoretic characterization of Maximum Entropy Inference due to Topsøe and others; (B) between maximum entropy distributions and sequences that are random (in the sense of Martin-Löf/Kolmogorov) with respect to the given constraint. These two applications have strong implications for the use of Maximum Entropy distributions in sequential prediction tasks, both for the logarithmic loss and for general loss functions. We identify circumstances under which Maximum Entropy predictions are almost optimal.
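
The objective named in the abstract, minimizing the relative entropy D(P∥Q) subject to a constraint, has a well-known form in the simplest setting of a finite outcome space and a single linear (moment) constraint: the minimizer is an exponential tilting of the prior Q, with the tilting parameter chosen so that the constraint holds. When Q is uniform this reduces to ordinary Maximum Entropy inference. The sketch below is not taken from the paper; the function name min_relative_entropy and the dice example are illustrative assumptions, and the code simply computes this standard minimizer numerically under those assumptions.

    import numpy as np

    def min_relative_entropy(q, f, t, lo=-50.0, hi=50.0, iters=200):
        """Minimize D(P || Q) over distributions P on a finite outcome space,
        subject to the single linear constraint E_P[f(X)] = t.

        The minimizer is an exponential tilting of Q,
            P_lam(x)  proportional to  q(x) * exp(lam * f(x)),
        with lam chosen so that the constraint holds. Since E_{P_lam}[f] is
        nondecreasing in lam, lam can be found by bisection.
        """
        q = np.asarray(q, dtype=float)
        f = np.asarray(f, dtype=float)

        def tilted(lam):
            s = lam * f
            w = q * np.exp(s - s.max())   # shift the exponent for numerical stability
            return w / w.sum()

        def gap(lam):                     # E_{P_lam}[f] - t
            return tilted(lam) @ f - t

        for _ in range(iters):            # bisection on the Lagrange multiplier lam
            mid = 0.5 * (lo + hi)
            if gap(mid) < 0:
                lo = mid
            else:
                hi = mid
        return tilted(0.5 * (lo + hi))

    # Jaynes' dice example: uniform prior on {1, ..., 6}, mean constrained to 4.5.
    # With a uniform Q, minimizing D(P || Q) is the same as maximizing entropy H(P).
    q = np.full(6, 1.0 / 6.0)
    f = np.arange(1, 7)
    p = min_relative_entropy(q, f, 4.5)
    print(np.round(p, 4))   # maximum entropy distribution over the six faces
    print(float(p @ f))     # ~4.5: the imposed mean constraint is met

For the dice example the result puts increasing weight on the higher faces, the familiar maximum entropy distribution with mean 4.5; the bracket [lo, hi] on the multiplier is an assumption that suffices here but would need widening for constraints close to the extremes of f.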

References

  1. K. Azoury and M. Warmuth. Relative loss bounds for on-line density estimation with the exponential family of distributions. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI ’99), pages 31–40. Morgan Kaufmann, 1999.

  2. A. Barron, J. Rissanen, and B. Yu. The minimum description length principle in coding and modeling. IEEE Transactions on Information Theory, 44(6):2743–2760, 1998.

  3. P. Billingsley. Convergence of Probability Measures. Wiley, 1968.

  4. T.M. Cover and J.A. Thomas. Elements of Information Theory. Wiley Interscience, New York, 1991.

  5. I. Csiszár. I-divergence geometry of probability distributions and minimization problems. The Annals of Probability, 3(1):146–158, 1975.

  6. I. Csiszár. Sanov property, generalized I-projection and a conditional limit theorem. The Annals of Probability, 12(3):768–793, 1984.

  7. I. Csiszár. Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems. The Annals of Statistics, 19(4):2032–2066, 1991.

  8. M. Feder. Maximum entropy as a special case of the minimum description length criterion. IEEE Transactions on Information Theory, 32(6):847–849, 1986.

  9. W. Feller. An Introduction to Probability Theory and Its Applications, volume 2. Wiley, 1968. Third edition.

  10. P.D. Grünwald. Maximum entropy and the glasses you are looking through. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI 2000). Morgan Kaufmann Publishers, 2000.

  11. P.D. Grünwald. The Minimum Description Length Principle and Reasoning under Uncertainty. PhD thesis, University of Amsterdam, The Netherlands, October 1998. Available as ILLC Dissertation Series 1998-03; see http://www.cwi.nl/~pdg.

  12. P.D. Grünwald. Strong entropy concentration, coding, game theory and randomness. Technical Report 010, EURANDOM, 2001.

  13. E.T. Jaynes. Where do we stand on maximum entropy? In R.D. Levine and M. Tribus, editors, The Maximum Entropy Formalism, pages 15–118. MIT Press, Cambridge, MA, 1978.

  14. E.T. Jaynes. On the rationale of maximum-entropy methods. Proceedings of the IEEE, 70:939–951, 1982.

  15. E.T. Jaynes. Papers on Probability, Statistics and Statistical Physics. Kluwer Academic Publishers, second edition, 1989.

  16. E.T. Jaynes. Probability theory: the logic of science. Available at ftp://bayes.wustl.edu/Jaynes.book/, 1996.

  17. J.N. Kapur and H.K. Kesavan. Entropy Optimization Principles with Applications. Academic Press, Inc., 1992.

  18. R.E. Kass and P.W. Voss. Geometrical Foundations of Asymptotic Inference. Wiley Interscience, 1997.

  19. J. Lafferty. Additive models, boosting and inference for generalized divergences. In Proceedings of the Twelfth Annual Workshop on Computational Learning Theory (COLT ’99), 1999.

  20. M. Li and P.M.B. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, New York, revised and expanded second edition, 1997.

  21. N. Merhav and M. Feder. A strong version of the redundancy-capacity theorem of universal coding. IEEE Transactions on Information Theory, 41(3):714–722, 1995.

  22. J. Rissanen. Stochastic Complexity in Statistical Inquiry. World Scientific Publishing Company, 1989.

  23. J. Rissanen. Strong optimality of the normalized ML models as universal codes, 2001. To appear in IEEE Transactions on Information Theory.

  24. F. Topsøe. Information theoretical optimization techniques. Kybernetika, 15(1), 1979.

  25. J. van Campenhout and T. Cover. Maximum entropy and conditional probability. IEEE Transactions on Information Theory, IT-27(4):483–489, 1981.


Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grünwald, P. (2001). Strong Entropy Concentration, Game Theory, and Algorithmic Randomness. In: Helmbold, D., Williamson, B. (eds) Computational Learning Theory. COLT 2001. Lecture Notes in Computer Science, vol 2111. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44581-1_21

  • DOI: https://doi.org/10.1007/3-540-44581-1_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42343-0

  • Online ISBN: 978-3-540-44581-4
