
On the sample complexity of various learning strategies in the probabilistic PAC learning paradigms

  • Selected Papers
  • Conference paper
Nonmonotonic and Inductive Logic (NIL 1991)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 659)


Abstract

Recently, in the context of learning probability distributions or stochastic rules, learning strategies that take into account both the simplicity of the hypothesized model and how well it explains the data have been shown to be effective. Several strategies fall into this general category, such as the minimum description length (MDL) principle and Occam's Razor. In this paper, we give an intuitive account of why hypotheses obtained by such strategies may exhibit fast convergence to the true or optimal model as the sample size increases. We do so using the notion of ‘uniform convergence.’ We then investigate how the ‘uniform convergence method’ might be applied to estimate the convergence rates of these strategies, using the well-known Kullback-Leibler divergence as the distance measure between probabilistic information sources. In the process, we show that, in order to prove fast convergence with respect to the Kullback-Leibler divergence by the uniform convergence method, it is in fact convenient to modify the MDL principle. We thus propose a new principle of statistical estimation, which we call ‘NIC’ (a new information criterion), motivated primarily by the goal of proving fast convergence to the true model.
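The abstract does not reproduce the formal definitions it builds on. As a reference sketch, the standard formulations of the Kullback-Leibler divergence and of the two-part MDL criterion are given below; the sample x^m, the hypothesis class H, and the code-length function L are generic placeholders, the empirical-estimate notation in the last display is hypothetical, and the exact form of the paper's NIC is not reproduced here.

```latex
% Kullback-Leibler divergence between a true source P and a hypothesis Q
% (the distance measure used for convergence in the paper)
D(P \,\|\, Q) \;=\; \sum_{x} P(x) \,\log \frac{P(x)}{Q(x)}

% Two-part MDL principle: choose the hypothesis minimizing the description
% length of the hypothesis plus that of the sample x^m encoded with it
\hat{h}_{\mathrm{MDL}}
  \;=\; \operatorname*{arg\,min}_{h \in \mathcal{H}}
        \Bigl[\, L(h) \;-\; \log P_h(x^m) \,\Bigr]

% 'Uniform convergence': the estimation error is controlled simultaneously
% over the whole class, so a minimizer over the class inherits the rate
% (\hat{D}_m denotes a generic empirical estimate from a sample of size m)
\sup_{h \in \mathcal{H}}
  \bigl|\, \hat{D}_m(h) - D(P \,\|\, P_h) \,\bigr|
  \;\xrightarrow[m \to \infty]{}\; 0
```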




Editor information

Gerhard Brewka, Klaus P. Jantke, Peter H. Schmitt


Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Abe, N. (1993). On the sample complexity of various learning strategies in the probabilistic PAC learning paradigms. In: Brewka, G., Jantke, K.P., Schmitt, P.H. (eds) Nonmonotonic and Inductive Logic. NIL 1991. Lecture Notes in Computer Science, vol 659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030387


  • DOI: https://doi.org/10.1007/BFb0030387

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56433-1

  • Online ISBN: 978-3-540-47557-6

  • eBook Packages: Springer Book Archive
