On Learning Correlated Boolean Functions Using Statistical Queries (Extended Abstract)

  • Conference paper
Algorithmic Learning Theory (ALT 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2225)

Abstract

In this paper, we study the problem of using statistical queries (SQ) to learn a class of highly correlated boolean functions, namely, a class of functions in which any pair agree on significantly more than a 1/2 fraction of the inputs. We give an almost-tight bound on how well one can approximate all the functions without making any query, and we then show that beyond this bound, the number of statistical queries the algorithm has to make increases with the “extra” advantage the algorithm gains in learning the functions. Here the advantage is defined to be the probability that the algorithm agrees with the target function minus the probability that it disagrees. An interesting consequence of our results is that the class of booleanized linear functions over a finite field (f_a(x) = 1 iff φ(a·x) = 1, where φ is an arbitrary boolean function mapping elements of GF(p) to ±1) is not efficiently learnable. This result is useful since the hardness of learning booleanized linear functions over a finite field is related to the security of certain cryptosystems ([B01]). In particular, we prove that the class of linear threshold functions over a finite field (f_{a,b}(x) = 1 iff a·x ≥ b) cannot be learned efficiently using statistical queries. This contrasts with the result of Blum et al. [BFK+96] that linear threshold functions over the reals (perceptrons) are learnable in the SQ model. Finally, we describe a PAC-learning algorithm that learns a class of linear threshold functions within a time bound that is provably impossible for statistical query algorithms. With properly chosen parameters, this class of linear threshold functions becomes an example of functions that are PAC-learnable but not SQ-learnable, yet are not parity functions.
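
To make the definitions above concrete, here is a small Python sketch (not from the paper) of the two function classes and of the advantage measure, together with a crude simulation of a statistical query oracle. The prime p, the dimension n, the uniform input distribution, and all sampling parameters are assumptions chosen purely for illustration.

```python
import random

# Assumed illustration parameters (not from the paper): a small prime p and dimension n.
p = 31
n = 8

def dot_mod_p(a, x):
    """Inner product a.x computed in GF(p), with the result taken in {0, ..., p-1}."""
    return sum(ai * xi for ai, xi in zip(a, x)) % p

def booleanized_linear(a, phi):
    """f_a(x) = 1 iff phi(a.x) = 1, where phi maps GF(p) into {+1, -1}."""
    return lambda x: 1 if phi(dot_mod_p(a, x)) == 1 else -1

def linear_threshold(a, b):
    """f_{a,b}(x) = 1 iff a.x >= b, i.e. the special case phi(v) = 1 iff v >= b."""
    return booleanized_linear(a, lambda v: 1 if v >= b else -1)

def random_input():
    return tuple(random.randrange(p) for _ in range(n))

def advantage(h, f, samples=20000):
    """Pr[h(x) = f(x)] - Pr[h(x) != f(x)] over uniformly random x in GF(p)^n,
    estimated by sampling (the 'advantage' defined in the abstract)."""
    agree = sum(h(x) == f(x) for x in (random_input() for _ in range(samples)))
    return 2 * agree / samples - 1

def sq_oracle(f, chi, tau, samples=20000):
    """A crude simulation of a statistical query oracle: estimate E_x[chi(x, f(x))]
    and perturb it by up to the tolerance tau, as a real oracle is allowed to do."""
    est = sum(chi(x, f(x)) for x in (random_input() for _ in range(samples))) / samples
    return est + random.uniform(-tau, tau)

if __name__ == "__main__":
    a = [random.randrange(p) for _ in range(n)]
    b = p // 2
    f = linear_threshold(a, b)       # the unknown target
    g = linear_threshold(a, b + 1)   # a "nearby" function from the same class
    print("estimated advantage of g against f:", advantage(g, f))
    # Example query: how often f(x) = 1 coincides with the first coordinate being even.
    chi = lambda x, y: 1 if (x[0] % 2 == 0) == (y == 1) else 0
    print("SQ estimate:", sq_oracle(f, chi, tau=0.01))
```

Under these assumptions, two threshold functions with adjacent thresholds disagree only when a·x falls exactly on the boundary value, i.e. on roughly a 1/p fraction of inputs, so the estimated advantage of g against f should come out close to 1; this is the sense in which such a class is “highly correlated.”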

References

  1. Andris Ambainis. Quantum lower bounds by quantum arguments, In Proceedings of the 32nd ACM Symposium on Theory of Computing, pages 636–643, 2000.

  2. Javed Aslam and Scott Decatur. General Bounds on Statistical Query Learning and PAC learning with Noise via Hypothesis Boosting, In Information and Computation, 141, pages 85–118 (1998).

  3. Leemon Baird. Blind Computation. Manuscript, 2001.

  4. Avrim Blum, Merrick Furst, Jeffrey Jackson, Michael Kearns, Yishay Mansour, and Steven Rudich. Weakly Learning DNF and Characterizing Statistical Query Learning Using Fourier Analysis. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing, pages 253–262, 1994.

  5. Avrim Blum, Alan Frieze, Ravi Kannan, and Santosh Vempala, A Polynomial-time Algorithm for Learning Noisy Linear Threshold Functions, In Algorithmica, 22:35–52, 1998. An extended abstract appears in Proceedings of the 37th Annual Symposium on Foundations of Computer Science (FOCS’96), pages 330-338.

  6. Avrim Blum, Adam Kalai and Hal Wasserman, Noise-tolerant Learning, the Parity problem, and the Statistical Query model. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pp. 435–440, 2000.

  7. Scott Decatur, Efficient Learning from Faulty Data. Ph.D. Thesis, Harvard University, TR-30-95, 1995.

  8. Oded Goldreich and Leonid Levin, A hard-core predicate for all one-way functions. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pp. 25–32, 1989.

  9. Jeff Jackson. On the Efficiency of Noise-Tolerant PAC Algorithms Derived from Statistical Queries. In Proceedings of the 13th Annual Workshop on Computational Learning Theory, 2000.

  10. Michael Kearns. Efficient noise-tolerant learning from statistical queries. In Journal of the ACM, 45(6), pp. 983–1006, 1998. Preliminary version in Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pp. 392-401, 1993.

  11. Rajeev Motwani and Prabhakar Raghavan, Randomized Algorithms, Cambridge University Press, 1995.

  12. Robert Schapire and Linda Sellie, Learning Sparse Multivariate Polynomials over a Field with Queries and Counterexamples. In Journal of Computer and System Sciences, 52, 201–213, 1996.

  13. Salil Vadhan, Private Communication.

  14. Leslie Valiant, A Theory of the Learnable. In Communications of the ACM, 27(11): 1134–1142, November 1984.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yang, K. (2001). On Learning Correlated Boolean Functions Using Statistical Queries (Extended Abstract). In: Abe, N., Khardon, R., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2001. Lecture Notes in Computer Science, vol 2225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45583-3_7

  • DOI: https://doi.org/10.1007/3-540-45583-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42875-6

  • Online ISBN: 978-3-540-45583-7
