Abstract
In this paper we analyze a connection between the outcomes of the cross-validation procedure and Vapnik bounds [1,2] on the generalization of learning machines. We do not focus on how well the measured cross-validation outcome estimates the generalization error, or how far it is from the training error; instead, we want to make statements about the cross-validation result without actually measuring it. In particular, we state probabilistically what ε-difference one can expect between the known Vapnik bound and the unknown cross-validation result under given conditions of the experiment. As a consequence, we are able to calculate the training-sample size necessary for ε to be sufficiently small, and for the optimal complexity indicated via SRM to be acceptable in the sense that cross-validation, if performed, would probably indicate the same complexity. We consider a non-stratified variant of cross-validation, which is convenient for the main theorem.
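Since the abstract centers on a non-stratified variant of cross-validation, a minimal sketch of that procedure may help fix ideas: the sample is cut into k contiguous folds with no attempt to balance class proportions across folds. The function names and the majority-vote toy classifier below are illustrative assumptions, not constructs from the paper.

```python
def non_stratified_kfold(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds of n items.

    Contiguous slicing means class proportions are NOT preserved per fold,
    i.e. the split is non-stratified.
    """
    # Distribute the remainder n % k over the first folds, one extra item each.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def cv_error(data, labels, fit, predict, k):
    """Cross-validation error rate: fraction of test-fold misclassifications."""
    n = len(data)
    errors = 0
    for train, test in non_stratified_kfold(n, k):
        model = fit([data[i] for i in train], [labels[i] for i in train])
        for i in test:
            if predict(model, data[i]) != labels[i]:
                errors += 1
    return errors / n
```

For example, `cv_error(data, labels, fit, predict, k=5)` with any pair of `fit`/`predict` callables returns the averaged test error over the five folds; it is this quantity that the paper relates probabilistically to the Vapnik bound.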
References
Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)
Vapnik, V.: Statistical Learning Theory: Inference from Small Samples. Wiley, New York (1998)
Vapnik, V.: Estimation of Dependences Based on Empirical Data. Information Science & Statistics. Springer, US (2006)
Cherkassky, V., Mulier, F.: Learning from data. John Wiley & Sons, Inc. (1998)
Hellman, M., Raviv, J.: Probability of error, equivocation and the Chernoff bound. IEEE Transactions on Information Theory IT-16, 368–372 (1970)
Schmidt, J., Siegel, A., Srinivasan, A.: Chernoff-Hoeffding bounds for applications with limited independence. SIAM Journal on Discrete Mathematics 8, 223–250 (1995)
Shawe-Taylor, J., et al.: A framework for structural risk minimization. In: COLT, pp. 68–76 (1996)
Devroye, L., Györfi, L., Lugosi, G.: A Probabilistic Theory of Pattern Recognition. Springer-Verlag, New York (1996)
Anthony, M., Shawe-Taylor, J.: A result of Vapnik with applications. Discrete Applied Mathematics 47, 207–217 (1993)
Krzyżak, A., et al.: Application of structural risk minimization to multivariate smoothing spline regression estimates. Bernoulli 8, 475–489 (2000)
Holden, S.: Cross-validation and the PAC learning model. Technical Report RN/96/64, Dept. of CS, University College, London (1996)
Holden, S.: PAC-like upper bounds for the sample complexity of leave-one-out cross-validation. In: 9th Annual ACM Workshop on Computational Learning Theory, pp. 41–50 (1996)
Kearns, M., Ron, D.: Algorithmic stability and sanity-check bounds for leave-one-out cross-validation. Neural Computation 11, 1427–1453 (1999)
Kearns, M.: A bound on the error of cross-validation, with consequences for the training-test split. In: Advances in Neural Information Processing Systems, vol. 8. MIT Press (1995)
Kearns, M.: An experimental and theoretical comparison of model selection methods. In: 8th Annual ACM Workshop on Computational Learning Theory, pp. 21–30 (1995)
Bartlett, P., Kulkarni, S., Posner, S.: Covering numbers for real-valued function classes. IEEE Transactions on Information Theory 47, 1721–1724 (1997)
Bartlett, P.: The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Transactions on Information Theory 44 (1998)
Ng, A.: Feature selection, l1 vs. l2 regularization, and rotational invariance. In: 21st ACM International Conference on Machine Learning. Proceeding Series, vol. 69 (2004)
Vapnik, V., Chervonenkis, A.: The necessary and sufficient conditions for the consistency of the method of empirical risk minimization. Yearbook of the Academy of Sciences of the USSR on Recognition, Classification and Forecasting 2, 217–249 (1989)
Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: International Joint Conference on Artificial Intelligence, IJCAI (1995)
Efron, B., Tibshirani, R.: An Introduction to the Bootstrap. Chapman & Hall, London (1993)
Hjorth, J.: Computer Intensive Statistical Methods Validation, Model Selection, and Bootstrap. Chapman & Hall, London (1994)
Weiss, S., Kulikowski, C.: Computer Systems That Learn. Morgan Kaufmann (1991)
Fu, W., Carroll, R., Wang, S.: Estimating misclassification error with small samples via bootstrap cross-validation. Bioinformatics 21, 1979–1986 (2005)
Vapnik, V., Chervonenkis, A.: On the uniform convergence of relative frequencies of events to their probabilities. Dokl. Akad. Nauk 181 (1968)
Korzeń, M., Klęsk, P.: Maximal Margin Estimation with Perceptron-like Algorithm. In: Rutkowski, L., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2008. LNCS (LNAI), vol. 5097, pp. 597–608. Springer, Heidelberg (2008)
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Klęsk, P. (2013). Probabilistic Connection between Cross-Validation and Vapnik Bounds. In: Filipe, J., Fred, A. (eds) Agents and Artificial Intelligence. ICAART 2011. Communications in Computer and Information Science, vol 271. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29966-7_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29965-0
Online ISBN: 978-3-642-29966-7