New lower bounds for statistical query learning

https://doi.org/10.1016/j.jcss.2004.10.003

Abstract

We prove two lower bounds in the statistical query (SQ) learning model. The first lower bound concerns weak learning. We prove that for a concept class of SQ-dimension d, a running time of Ω(d/log d) is needed. The SQ-dimension of a concept class is defined to be the maximum number of concepts that are "uniformly correlated", in the sense that every pair of them has nearly the same correlation. This lower bound matches the upper bound of Blum et al. (Weakly learning DNF and characterizing statistical query learning using Fourier analysis, STOC 1994, pp. 253–262), up to a logarithmic factor. We prove this lower bound against an "honest SQ-oracle", which yields a stronger result than lower bounds against the more commonly used "adversarial SQ-oracles". The second lower bound is more general. It gives a continuous trade-off between the "advantage" of an algorithm in learning the target function and the number of queries it needs to make, where the advantage of an algorithm is the probability that it succeeds in predicting a label minus the probability that it fails. Both lower bounds extend and/or strengthen previous results and solve an open problem left in previous papers. An earlier version of this paper [K. Yang, New lower bounds for statistical query learning, in: Proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002), Sydney, Australia, July 8–10, Lecture Notes in Computer Science, vol. 2375, 2002, pp. 229–243] appeared in the proceedings of the 15th Annual Conference on Computational Learning Theory (COLT 2002).
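For orientation, here is a hedged sketch of the two quantities the abstract refers to, stated for ±1-valued concepts under a distribution D; the exact definitions and constants are those of the paper body, not reproduced here. Writing the correlation of two concepts as an inner product, the advantage of a hypothesis h against a target f admits the following expression:

\[
% A sketch only: \pm 1-valued concepts assumed; exact parameters are fixed in the paper body.
\langle f, g \rangle_D = \mathop{\mathbb{E}}_{x \sim D}\big[f(x)\,g(x)\big],
\qquad
\mathrm{Adv}(h) = \Pr_{x \sim D}\big[h(x) = f(x)\big] \;-\; \Pr_{x \sim D}\big[h(x) \neq f(x)\big] \;=\; \langle h, f \rangle_D .
\]

In this notation, a set of concepts f_1, …, f_d is "uniformly correlated" when all pairwise correlations ⟨f_i, f_j⟩_D for i ≠ j lie close to one common value, and the SQ-dimension is the largest such d; how close "nearly the same" must be is a parameter fixed in the paper.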

Keywords

Lower bound
Statistical query
PAC learning
KL-divergence
Singular-value decomposition


1. Partially supported by the CMU SCS Alumni Fellowship and the NSF Aladdin Center, Grant CCR-0122581.