Stochastic complexity in learning

  • Conference paper
Computational Learning Theory (EuroCOLT 1995)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 904)

Abstract

This is an expository paper on the latest results in the theory of stochastic complexity and the associated MDL principle, with special interest in modeling problems arising in machine learning. As an illustration we discuss the problem of designing MDL decision trees, which are meant to improve on earlier designs in two ways. First, by using a sharper formula for the stochastic complexity at the nodes, the previously observed tendency to produce trees that are too small appears to be overcome. Second, a pruning algorithm based on dynamic programming is described for finding the optimal trees; it generalizes an algorithm described in Nohre (1994).
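To make the idea of pruning a decision tree by code length concrete, here is a minimal Python sketch. It is not the algorithm of the paper or of Nohre (1994), whose details are not given in the abstract. It assumes binary class labels, uses the exact normalized maximum likelihood (NML) code length for the Bernoulli class as a stand-in for the node-wise stochastic complexity, charges one bit per node to mark it as a leaf or a split, and ignores the cost of describing the split tests. The names `Node`, `bernoulli_nml_code_length`, and `prune` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


def bernoulli_nml_code_length(n0: int, n1: int) -> float:
    """NML code length in bits for a binary sample with n0 zeros and n1 ones.

    Assumption: this stands in for the node-wise stochastic complexity;
    the paper's own formula is not reproduced here.
    """
    n = n0 + n1
    if n == 0:
        return 0.0

    def max_log_lik(k: int) -> float:
        # log2 of the maximized Bernoulli likelihood of a string with k ones.
        total = 0.0
        for count in (k, n - k):
            if count > 0:
                total += count * math.log2(count / n)
        return total

    # Normalizer: sum of maximized likelihoods over all binary strings of
    # length n, grouped by their number of ones (fine for small n only).
    normalizer = sum(
        math.comb(n, k) * 2.0 ** max_log_lik(k) for k in range(n + 1)
    )
    return -max_log_lik(n1) + math.log2(normalizer)


@dataclass
class Node:
    n0: int  # number of class-0 examples reaching this node
    n1: int  # number of class-1 examples reaching this node
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def prune(node: Node) -> float:
    """Prune the subtree in place and return its best total code length in bits.

    Bottom-up recursion: each node keeps its children only if encoding the
    data in the (already pruned) children is cheaper than encoding it here.
    """
    leaf_cost = 1.0 + bernoulli_nml_code_length(node.n0, node.n1)
    if node.left is None or node.right is None:
        node.left = node.right = None
        return leaf_cost
    split_cost = 1.0 + prune(node.left) + prune(node.right)
    if leaf_cost <= split_cost:
        node.left = node.right = None  # collapse this subtree into a leaf
        return leaf_cost
    return split_cost


# A split that separates the classes perfectly is kept ...
useful = Node(4, 4, Node(4, 0), Node(0, 4))
prune(useful)
# ... while a split that leaves both children as mixed as the parent is removed.
useless = Node(2, 2, Node(1, 1), Node(1, 1))
prune(useless)
```

Because each subtree's best cost depends only on its own counts and on the best costs of its children, one bottom-up pass visits every node once. A fuller treatment would also have to charge for the description of each split test.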

References

  • Chaitin, G.J. (1966), ‘On the Length of Programs for Computing Finite Binary Sequences’, JACM, 13, 547–569.

  • Chaitin, G.J. (1969), ‘On the Length of Programs for Computing Finite Binary Sequences: Statistical Considerations’, JACM, 16, 145–159.

  • Davisson, L.D. (1973), ‘Universal Noiseless Coding’, IEEE Trans. on Information Theory, Vol. IT-19, No. 6, 783–795.

  • Kolmogorov, A.N. (1965), ‘Three Approaches to the Quantitative Definition of Information’, Problems of Information Transmission 1, 1–7.

  • Krichevsky, R.E. and Trofimov, V.K. (1981), ‘The Performance of Universal Encoding’, IEEE Trans. on Information Theory, Vol. IT-27, No. 2, 199–207.

  • Lehtokangas, M., Saarinen, J., Huuhtanen, P. and Kaski, K. (1993), ‘Chaotic Time Series Modeling with Optimum Neural Network Architecture’, Proc. of International Joint Conference on Neural Networks, IJCNN'93, Nagoya, Japan.

  • Li, M. and Vitányi, P. (1993), An Introduction to Kolmogorov Complexity and Its Applications, Springer-Verlag, New York (546 pages).

  • Nohre, R. (1994), ‘Some Topics in Descriptive Complexity’, PhD Thesis, Linköping University, Linköping, Sweden.

  • Quinlan, J.R. and Rivest, R.L. (1989), ‘Inferring Decision Trees Using the Minimum Description Length Principle’, Information and Computation, 80, 227–248.

  • Rissanen, J. (1984), ‘Universal Coding, Information, Prediction, and Estimation’, IEEE Trans. on Information Theory, Vol. IT-30, No. 4, 629–636.

  • Rissanen, J. (1986), ‘Stochastic Complexity and Modeling’, Annals of Statistics, Vol. 14, 1080–1100.

  • Rissanen, J. (1989), Stochastic Complexity in Statistical Inquiry, World Scientific Publ. Co., Suite 1B, 1060 Main Street, River Edge, New Jersey (175 pages).

  • Rissanen, J. and Yu, B. (1992), ‘MDL Learning’, to appear.

  • Rissanen, J. (1992), ‘Information Theory and Neural Nets’, a chapter in the book Mathematical Perspectives of Neural Networks (P. Smolensky, M. Mozer, D. Rumelhart, eds.), Lawrence Erlbaum Assoc. (to appear in 1995).

  • Rissanen, J. (1994), ‘Fisher Information and Stochastic Complexity’, submitted to IEEE Trans. on Information Theory.

  • Rissanen, J. and Wax, M. (1988), Algorithm for Constructing Tree Structured Classifiers, US Patent No. 4,719,571.

  • Shtarkov, Yu.M. (1987), ‘Universal Sequential Coding of Single Messages’, translated from Problems of Information Transmission, Vol. 23, No. 3, 3–17, July–September 1987.

  • Solomonoff, R.J. (1964), ‘A Formal Theory of Inductive Inference’, Part I, Information and Control 7, 1–22; Part II, Information and Control 7, 224–254.

  • Yamanishi, K. (1990), ‘A Learning Criterion for Stochastic Rules’, Proc. of the Third Annual Workshop on Computational Learning Theory, August 1990.

  • Weinberger, M.J., Rissanen, J., and Feder, M. (1993), ‘A Universal Finite Memory Source’, to appear in IEEE Trans. on Information Theory.

Editor information

Paul Vitányi

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rissanen, J. (1995). Stochastic complexity in learning. In: Vitányi, P. (ed.) Computational Learning Theory. EuroCOLT 1995. Lecture Notes in Computer Science, vol 904. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-59119-2_178

  • DOI: https://doi.org/10.1007/3-540-59119-2_178

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-59119-1

  • Online ISBN: 978-3-540-49195-8

  • eBook Packages: Springer Book Archive
