Abstract
We consider the sample complexity of agnostic learning with respect to squared loss. It is known that if the function class F used for learning is convex, then one can obtain better sample complexity bounds than in the general case. It has been claimed that a lower bound shows there is an essential gap in the rate between the convex and nonconvex cases. In this paper we show that the proof of this lower bound has a gap in it, although we do not provide a definitive answer as to the validity of the bound itself. More positively, we show that one can obtain “fast” sample complexity bounds for nonconvex F for “most” target conditional expectations. The new bounds depend on the detailed geometry of F, in particular on the distance, in a certain sense, of the target’s conditional expectation from the set of nonuniqueness points of the class F.
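As a toy illustration of the geometric notion mentioned in the abstract, the sketch below uses the standard approximation-theoretic meaning of a nonuniqueness point: a target with more than one nearest element in F. Everything in it is an assumption made for illustration. The two-element class `F`, the Euclidean distance on value vectors, and the helper `nearest_gap` are hypothetical, and the gap it computes is only a crude proxy, not the distance "in a certain sense" used in the paper.

```python
import math

def l2_dist(u, v):
    """Euclidean distance between two equal-length value vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_gap(target, function_class):
    """Return (index of a nearest element, gap between the two smallest distances).

    A gap of zero means the target has at least two nearest elements,
    i.e. it is a nonuniqueness point of this (finite) class.
    """
    dists = sorted((l2_dist(target, f), i) for i, f in enumerate(function_class))
    (d0, i0), (d1, _) = dists[0], dists[1]
    return i0, d1 - d0

# Hypothetical nonconvex class: two functions on a two-point domain,
# each stored as its vector of values.
F = [(0.0, 0.0), (2.0, 0.0)]

on_bisector = (1.0, 3.0)   # equidistant from both elements of F
off_bisector = (0.5, 3.0)  # strictly closer to the first element

_, gap_on = nearest_gap(on_bisector, F)
best, gap_off = nearest_gap(off_bisector, F)

print("gap on bisector:", gap_on)
print("gap off bisector:", gap_off, "nearest index:", best)
```

For this two-element class the nonuniqueness points are exactly the targets on the perpendicular bisector of the two elements, so "most" targets have a unique best approximation in F.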
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Mendelson, S., Williamson, R.C. (2002). Agnostic Learning Nonconvex Function Classes. In: Kivinen, J., Sloan, R.H. (eds) Computational Learning Theory. COLT 2002. Lecture Notes in Computer Science(), vol 2375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45435-7_1
DOI: https://doi.org/10.1007/3-540-45435-7_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43836-6
Online ISBN: 978-3-540-45435-9
eBook Packages: Springer Book Archive