Abstract
Recently, Kalai et al. [1] have shown (among other things) that linear threshold functions over the Boolean cube and unit sphere are agnostically learnable with respect to the uniform distribution using the hypothesis class of polynomial threshold functions. Their primary algorithm computes monomials of large constant degree, although they also analyze a low-degree algorithm for learning origin-centered halfspaces over the unit sphere. This paper explores noise-tolerant learnability of linear thresholds over the cube when the learner sees a very limited portion of each instance. Uniform-distribution weak learnability results are derived for the agnostic, unknown attribute noise, and malicious noise models. The noise rates that can be tolerated vary: the rate is essentially optimal for attribute noise, constant (roughly 1/8) for agnostic learning, and non-trivial (\(\Omega(1/\sqrt{n})\)) for malicious noise. In addition, a new model that lies between the product attribute and malicious noise models is introduced, and in this stronger model results similar to those for the standard attribute noise model are obtained for learning homogeneous linear thresholds with respect to the uniform distribution over the cube. The learning algorithms presented are simple and have small-polynomial running times.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Kalai, A.T., Klivans, A.R., Mansour, Y., Servedio, R.A.: Agnostically learning halfspaces. In: Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pp. 11–20 (2005)
Ben-David, S., Dichterman, E.: Learning with restricted focus of attention. In: Proceedings of the 6th Annual Conference on Computational Learning Theory, pp. 287–296 (1993)
Gotsman, C., Linial, N.: Spectral properties of threshold functions. Combinatorica 14, 35–50 (1994)
Blum, A., Furst, M., Jackson, J., Kearns, M., Mansour, Y., Rudich, S.: Weakly learning DNF and characterizing statistical query learning using Fourier analysis. In: Proceedings of the 26th Annual ACM Symposium on Theory of Computing, pp. 253–262 (1994), Preliminary version available as: http://www.mathcs.duq.edu/~jackson/dnfsq.ps
Valiant, L.G.: A theory of the learnable. Communications of the ACM 27, 1134–1142 (1984)
Kearns, M.J., Schapire, R.E., Sellie, L.M.: Toward efficient agnostic learning. Machine Learning 17, 115–141 (1994)
Shackelford, G., Volper, D.: Learning k-DNF with noise in the attributes. In: Proceedings of the 1988 Workshop on Computational Learning Theory, pp. 97–103 (1988)
Valiant, L.G.: Learning disjunctions of conjunctions. In: Proceedings of the Ninth International Joint Conference on Artificial Intelligence, vol. 1, pp. 560–566 (1985)
Hoeffding, W.: Probability inequalities for sums of bounded random variables. American Statistical Association Journal 58, 13–30 (1963)
Bshouty, N.H., Jackson, J.C., Tamon, C.: Uniform-distribution attribute noise learnability. Inf. Comput. 187, 277–290 (2003)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jackson, J.C. (2006). Uniform-Distribution Learnability of Noisy Linear Threshold Functions with Restricted Focus of Attention. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science(), vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_24
Download citation
DOI: https://doi.org/10.1007/11776420_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9
eBook Packages: Computer ScienceComputer Science (R0)