Abstract
The Statistical Query (SQ) model provides an elegant means for generating noise-tolerant PAC learning algorithms that run in time inverse polynomial in the noise rate. Whether or not there is an SQ algorithm for every noise-tolerant PAC algorithm that is efficient in this sense remains an open question. However, we show that PAC algorithms derived from the Statistical Query model are not always the most efficient possible. Specifically, we give a general definition of SQ-based algorithm and show that there is a subclass of parity functions for which there is an efficient PAC algorithm requiring asymptotically less running time than any SQ-based algorithm. While it turns out that this result can be derived fairly easily by combining a recent algorithm of Blum, Kalai, and Wasserman with an older lower bound, we also provide alternate, Fourier-based approaches to both the upper and lower bounds that strengthen the results in various ways. The lower bound in particular is stronger than might be expected, and the amortized technique used in deriving this bound may be of independent interest.
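As a minimal sketch of why statistical queries tolerate classification noise (an illustration only, not the construction analyzed in the article), consider estimating the correlation E[g(x)f(x)] between a query function g and a target f from examples whose ±1 labels are flipped independently with probability η < 1/2. The noisy correlation equals (1 − 2η) times the true one, so rescaling the empirical mean recovers it, at the cost of a sample size that grows with 1/(1 − 2η)². The function names, the parameter values, and the assumption that η is known exactly are hypothetical choices made for this example.

```python
import random

def parity(x, subset):
    """Parity chi_S(x) in {-1, +1}: -1 iff an odd number of the bits of x indexed by S are 1."""
    return -1 if sum(x[i] for i in subset) % 2 else 1

def noisy_example(target, n, eta, rng):
    """Draw x uniformly from {0,1}^n and flip the label target(x) with probability eta."""
    x = [rng.randint(0, 1) for _ in range(n)]
    label = target(x)
    if rng.random() < eta:
        label = -label
    return x, label

def estimate_correlation(query, target, n, eta, samples, rng):
    """Estimate E[query(x) * target(x)] from noisy examples.

    Assumes eta is known (an assumption of this sketch). With independent
    label flips at rate eta < 1/2, the noisy correlation is (1 - 2*eta)
    times the true correlation, so the empirical mean is rescaled.
    """
    total = 0
    for _ in range(samples):
        x, label = noisy_example(target, n, eta, rng)
        total += query(x) * label
    return (total / samples) / (1 - 2 * eta)

rng = random.Random(0)  # fixed seed so the run is reproducible
n, eta = 8, 0.2
target = lambda x: parity(x, [0, 3, 5])

# Querying with the target parity itself: the true correlation is 1.
same = estimate_correlation(target, target, n, eta, 20000, rng)
# Querying with a different parity: the true correlation is 0 under the uniform distribution.
other = estimate_correlation(lambda x: parity(x, [1, 2]), target, n, eta, 20000, rng)
print(same, other)
```

The second estimate also hints at why parity functions are hard for query-based learners: under the uniform distribution, every parity other than the target has zero correlation with it, so a single such query reveals little about which parity is the target.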
References
J.A. Aslam and S.E. Decatur, Specification and simulation of statistical query algorithms for efficiency and noise tolerance, Journal of Computer and System Sciences 56(2) (1998) 191–208.
A. Blum, M. Furst, J. Jackson, M. Kearns, Y. Mansour and S. Rudich, Weakly learning DNF and characterizing statistical query learning using Fourier analysis, in: Proceedings of the 26th Annual ACM Symposium on Theory of Computing (1994) pp. 253–262. Preliminary version available as http://www.mathcs.duq.edu/~jackson/dnfsq.ps.
A. Blum, A. Kalai and H. Wasserman, Noise-tolerant learning, the parity problem, and the Statistical Query model, in: Proceedings of the 32nd Annual ACM Symposium on Theory of Computing (2000) pp. 435–440.
B. Bollobás, Random Graphs (Academic Press, 1985).
N. Bshouty and C. Tamon, On the Fourier spectrum of monotone functions, Journal of the ACM 43(4) (1996) 747–770.
Y. Freund and R.E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences 55(1) (1997) 119–139. First appeared in EuroCOLT'95.
O. Goldreich and L.A. Levin, A hard-core predicate for all one-way functions, in: Proceedings of the 21st Annual ACM Symposium on Theory of Computing (1989) pp. 25–32.
T. Hagerup and C. Rüb, A guided tour of Chernoff bounds, Information Processing Letters 33 (1989/90) 305–308.
W. Hoeffding, Probability inequalities for sums of bounded random variables, Journal of the American Statistical Association 58 (1963) 13–30.
J. Jackson, An efficient membership-query algorithm for learning DNF with respect to the uniform distribution, Journal of Computer and System Sciences 55(3) (1997) 414–440. Earlier version appeared in Proceedings of the 35th Annual Symposium on Foundations of Computer Science (1994) pp. 42–53. Preliminary extended version available as http://www.mathcs.duq.edu/~jackson/J97.ps.
M.J. Kearns, Efficient noise-tolerant learning from statistical queries, Journal of the ACM 45(6) (1998) 983–1006. Earlier version appeared in Proceedings of the 25th Annual ACM Symposium on Theory of Computing (1993) pp. 392–401.
E. Kushilevitz and Y. Mansour, Learning decision trees using the Fourier spectrum, SIAM Journal on Computing 22(6) (December 1993) 1331–1348. Earlier version appeared in Proceedings of the 23rd Annual ACM Symposium on Theory of Computing (1991) pp. 455–464.
N. Linial, Y. Mansour and N. Nisan, Constant depth circuits, Fourier transform, and learnability, Journal of the ACM 40(3) (1993) 607–620. Earlier version appeared in Proceedings of the 30th Annual Symposium on Foundations of Computer Science (1989) pp. 574–579.
J.E. Littlewood, On the probability in the tail of a binomial distribution, Advances in Applied Probability 1 (1969) 43–72.
F.J. MacWilliams and N.J.A. Sloane, The Theory of Error-Correcting Codes (Elsevier Science Publishers, 1977).
R.E. Schapire, The strength of weak learnability, Machine Learning 5 (1990) 197–227.
L.G. Valiant, A theory of the learnable, Communications of the ACM 27(11) (1984) 1134–1142.
Cite this article
Jackson, J. On the Efficiency of Noise-Tolerant PAC Algorithms Derived from Statistical Queries. Annals of Mathematics and Artificial Intelligence 39, 291–313 (2003). https://doi.org/10.1023/A:1024697502780
DOI: https://doi.org/10.1023/A:1024697502780