Abstract
It is well known, due to Linial, Mansour and Nisan, that \(AC^0\) circuits can be learned by the Low-Degree Algorithm in quasi-polynomial time under the uniform distribution. Furst et al. and Blais et al. then showed that this learnability also holds when the input variables are mutually independent, i.e. drawn from product distributions. However, a long-standing question is whether \(AC^0\) can be learned beyond these distributions, e.g. under some non-product distributions.
In this paper we show that \(AC^0\) can be non-trivially learned under a class of distributions that we call k-dependent distributions. Informally, a k-dependent distribution is one in which, for a randomly sampled string (serving as input to the circuit being learned), some of the bits are mutually independent, and each remaining bit depends on at most k of these independent bits. We note that this class contains some natural non-product distributions. We show that, with respect to any such distribution, if the dependence relations among the bits of the sampled strings are known, then \(AC^0\) can be learned in quasi-polynomial time when k is poly-logarithmic; otherwise, the learning takes exponential time but still uses about as many examples as in the former case. In the latter case, although the time complexity is exponential, it is significantly smaller than that of the brute-force method (when the circuit being learned is sufficiently large).
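The Low-Degree Algorithm mentioned above estimates the Fourier coefficients of all low-degree parities from random examples and predicts with the sign of the truncated Fourier expansion. The following is a minimal sketch over the uniform distribution; the toy AND target, the degree bound, and the exhaustive sampling are illustrative assumptions for this sketch, not the paper's construction:

```python
import itertools

def chi(S, x):
    # Parity character chi_S(x) = prod_{i in S} x_i over {-1, +1}-valued bits.
    p = 1
    for i in S:
        p *= x[i]
    return p

def low_degree_learn(examples, n, d):
    # Estimate hat{f}(S) = E[f(x) * chi_S(x)] for every |S| <= d from the
    # examples, then predict with the sign of the truncated Fourier expansion.
    m = len(examples)
    coeffs = {
        S: sum(y * chi(S, x) for x, y in examples) / m
        for k in range(d + 1)
        for S in itertools.combinations(range(n), k)
    }
    def h(x):
        val = sum(c * chi(S, x) for S, c in coeffs.items())
        return 1 if val >= 0 else -1
    return h

# Toy depth-1 target: AND of the first two bits (over {-1, +1}, +1 = "true").
# Its Fourier expansion has degree 2, so degree bound d = 2 suffices.
def target(x):
    return 1 if x[0] == 1 and x[1] == 1 else -1

n, d = 4, 2
all_inputs = list(itertools.product((-1, 1), repeat=n))
examples = [(x, target(x)) for x in all_inputs]  # exhaustive "sample"
h = low_degree_learn(examples, n, d)
```

Since the exhaustive sample makes the coefficient estimates exact and AND has Fourier degree 2, the hypothesis agrees with the target on every input; with genuinely random examples the agreement would only hold with high probability.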
Notes
- 1.
We sketch the DNF construction in [7]. Suppose \(\mu \) is a probability that we wish to approximate. Since a difference of \(\epsilon 2^{-\log ^c n}\) is now allowed, we only need to construct a DNF that outputs 1 with probability \(\mu \) kept to \(O(\log ^c n+\log 1/\epsilon )\) bits after the binary point. So just assume \(\mu =\sum _{j=1}^l a_j2^{-j}\), where \(l=O(\log ^c n+\log 1/\epsilon )\) and \(a_j\in \{0,1\}\). Create one AND for each j with \(a_j = 1\) such that the AND, on input j uniform bits, outputs 1 with probability \(2^{-j}\). Also ensure that at most one AND among them outputs 1 on each input. Let the DNF be the OR of all these AND's; it has \(O(l^2)\) input bits in total and is of size O(l).
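As a sanity check on the probability bookkeeping in this construction, the sketch below simulates a simplified variant in which the terms are made mutually exclusive by a shared prefix of only l bits (an illustrative simplification, not the \(O(l^2)\)-bit DNF of [7]); exhaustive enumeration confirms the output probability is exactly \(\mu \):

```python
from itertools import product
from fractions import Fraction

def dnf_output(a, bits):
    # a[j-1] is the binary digit a_j of mu = sum_j a_j * 2^{-j}.
    # Term j "fires" iff b_1 .. b_{j-1} are all 1 and b_j is 0; this event
    # has probability 2^{-j}, and the events for distinct j are disjoint,
    # so at most one term produces 1 on each input.
    for j in range(1, len(a) + 1):
        if all(bits[:j - 1]) and not bits[j - 1]:
            return 1 if a[j - 1] == 1 else 0
    return 0  # no term fired (all bits were 1)

# mu = 1/2 + 1/8 = 5/8, i.e. binary digits a = [1, 0, 1].
a = [1, 0, 1]
ones = sum(dnf_output(a, bits) for bits in product((0, 1), repeat=len(a)))
prob = Fraction(ones, 2 ** len(a))  # exact output probability over uniform bits
```

Summing \(2^{-j}\) over exactly the indices with \(a_j = 1\) recovers \(\mu \), which is the point of keeping the firing events disjoint.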
References
Ajtai, M., Ben-Or, M.: A theorem on probabilistic constant depth computations. In: Proceedings of the 16th Annual ACM Symposium on Theory of Computing, Washington, DC, USA, pp. 471–474, 30 April–2 May 1984. http://doi.acm.org/10.1145/800057.808715
Aspnes, J., Beigel, R., Furst, M., Rudich, S.: The expressive power of voting polynomials. Combinatorica 14(2), 1–14 (1994)
Beigel, R.: When do extra majority gates help? Polylog(n) majority gates are equivalent to one. Comput. Complex. 4, 314–324 (1994)
Blais, E., O’Donnell, R., Wimmer, K.: Polynomial regression under arbitrary product distributions. Mach. Learn. 80(2–3), 273–294 (2010). http://dx.doi.org/10.1007/s10994-010-5179-6
Boppana, R.B.: The average sensitivity of bounded-depth circuits. Inf. Process. Lett. 63(5), 257–261 (1997). http://dx.doi.org/10.1016/S0020-0190(97)00131-2
Bun, M., Thaler, J.: Hardness amplification and the approximate degree of constant-depth circuits. In: Halldórsson, M.M., Iwama, K., Kobayashi, N., Speckmann, B. (eds.) ICALP 2015. LNCS, vol. 9134, pp. 268–280. Springer, Heidelberg (2015). doi:10.1007/978-3-662-47672-7_22
Furst, M.L., Jackson, J.C., Smith, S.W.: Improved learning of \(AC^0\) functions. In: Warmuth, M.K., Valiant, L.G. (eds.) Proceedings of the Fourth Annual Workshop on Computational Learning Theory, COLT 1991, Santa Cruz, California, USA, pp. 317–325. Morgan Kaufmann, 5–7 August 1991. http://dl.acm.org/citation.cfm?id=114866
Gopalan, P., Servedio, R.A.: Learning and lower bounds for \(AC^0\) with threshold gates. In: Serna, M., Shaltiel, R., Jansen, K., Rolim, J. (eds.) APPROX/RANDOM 2010. LNCS, vol. 6302, pp. 588–601. Springer, Heidelberg (2010). doi:10.1007/978-3-642-15369-3_44
Hajnal, A., Maass, W., Pudlák, P., Szegedy, M., Turán, G.: Threshold circuits of bounded depth. J. Comput. Syst. Sci. 46(2), 129–154 (1993). http://dx.doi.org/10.1016/0022-0000(93)90001-D
Harsha, P., Srinivasan, S.: On polynomial approximations to \(AC^0\). CoRR abs/1604.08121 (2016). http://arxiv.org/abs/1604.08121
Håstad, J.: A slight sharpening of LMN. J. Comput. Syst. Sci. 63(3), 498–508 (2001). http://dx.doi.org/10.1006/jcss.2001.1803
Haussler, D.: Decision theoretic generalizations of the PAC model for neural net and other learning applications. Inf. Comput. 100(1), 78–150 (1992)
Jackson, J.C., Klivans, A., Servedio, R.A.: Learnability beyond \(\text{AC }^0\). In: IEEE Conference on Computational Complexity, p. 26. IEEE Computer Society (2002)
Kalai, A.T., Klivans, A.R., Mansour, Y., Servedio, R.A.: Agnostically learning halfspaces. SIAM J. Comput. 37(6), 1777–1805 (2008). http://dx.doi.org/10.1137/060649057
Kearns, M.J., Schapire, R.E., Sellie, L.: Toward efficient agnostic learning. Mach. Learn. 17(2–3), 115–141 (1994)
Linial, N., Mansour, Y., Nisan, N.: Constant depth circuits, Fourier transform, and learnability. J. ACM 40(3), 607–620 (1993)
Tal, A.: Tight bounds on the Fourier spectrum of \(AC^0\). Electron. Colloq. Comput. Complex. (ECCC) 21, 174 (2014). http://eccc.hpi-web.de/report/2014/174
Tarui, J.: Probabilistic polynomials, \(AC^0\) functions, and the polynomial-time hierarchy. Theor. Comput. Sci. 113(1), 167–183 (1993). http://dx.doi.org/10.1016/0304-3975(93)90214-E
Toda, S., Ogiwara, M.: Counting classes are at least as hard as the polynomial-time hierarchy. SIAM J. Comput. 21(2), 316–328 (1992). http://dx.doi.org/10.1137/0221023
Valiant, L.G.: A theory of the learnable. Commun. ACM 27(11), 1134–1142 (1984)
Acknowledgments
We are grateful to the reviewers of TAMC 2016 for their useful comments. This work is supported by the National Natural Science Foundation of China (Grant No. 61572309) and Major State Basic Research Development Program (973 Plan) of China (Grant No. 2013CB338004) and Research Fund of Ministry of Education of China and China Mobile (Grant No. MCM20150301).
© 2017 Springer International Publishing AG
Cite this paper
Ding, N., Ren, Y., Gu, D. (2017). Learning \(AC^0\) Under k-Dependent Distributions. In: Gopal, T., Jäger, G., Steila, S. (eds.) Theory and Applications of Models of Computation. TAMC 2017. Lecture Notes in Computer Science, vol. 10185. Springer, Cham. https://doi.org/10.1007/978-3-319-55911-7_14
Print ISBN: 978-3-319-55910-0
Online ISBN: 978-3-319-55911-7