A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions

Blum, A.; Frieze, A.; Kannan, R.; Vempala, S.

doi:10.1007/PL00013833

Published: September 1998

Algorithmica, Volume 22, pages 35–52 (1998)

A. Blum¹, A. Frieze², R. Kannan¹ & S. Vempala¹

Abstract.

In this paper we consider the problem of learning a linear threshold function (a half-space in n dimensions, also called a "perceptron"). Methods for solving this problem generally fall into two categories. In the absence of noise, this problem can be formulated as a Linear Program and solved in polynomial time with the Ellipsoid Algorithm or Interior Point methods. Alternatively, simple greedy algorithms such as the Perceptron Algorithm are often used in practice and have certain provable noise-tolerance properties; but their running time depends on a separation parameter, which quantifies the amount of "wiggle room" available for a solution, and can be exponential in the description length of the input.
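As a point of reference for the greedy approach mentioned above, the following is a minimal sketch of the classic Perceptron update rule for a half-space through the origin. It is the textbook algorithm, not the modified version developed in the paper; the function name `perceptron`, the `max_updates` cap, and the toy data are assumptions made for illustration. On separable data the classical analysis bounds the number of updates by (R/γ)², where R is the largest example norm and γ is the separation parameter, which is why a tiny γ can make the running time blow up.

```python
def perceptron(examples, max_updates=10_000):
    """Textbook Perceptron for a half-space through the origin.

    examples: list of (x, y) pairs, x a tuple of floats, y in {+1, -1}.
    Returns (w, updates) once w classifies every example correctly.
    Raises RuntimeError if max_updates (an illustrative cap) is exhausted.
    """
    n = len(examples[0][0])
    w = [0.0] * n
    updates = 0
    while updates < max_updates:
        # Find any example that the current w gets wrong (or has on the boundary).
        mistake = None
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                mistake = (x, y)
                break
        if mistake is None:
            return w, updates
        # Additive update: move w toward the misclassified example.
        x, y = mistake
        w = [wi + y * xi for wi, xi in zip(w, x)]
        updates += 1
    raise RuntimeError("no consistent hypothesis found within max_updates")


# Hypothetical toy data: four well-separated points in the plane.
examples = [((1.0, 2.0), 1), ((2.0, 1.0), 1), ((-1.0, -1.5), -1), ((-2.0, 0.5), -1)]
w, updates = perceptron(examples)
```

On this toy set a single update already separates the data; with labels flipped by noise, or with a very small margin, the same loop may fail to terminate within the cap.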

In this paper we show how simple greedy methods can be used to find weak hypotheses (hypotheses that correctly classify noticeably more than half of the examples) in polynomial time, without dependence on any separation parameter. Suitably combining these hypotheses results in a polynomial-time algorithm for learning linear threshold functions in the PAC model in the presence of random classification noise. The same approach also gives a polynomial-time algorithm for learning linear threshold functions in the Statistical Query model of Kearns.
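To make the noise model concrete, here is a small simulation sketch of random classification noise, in which each label is flipped independently with probability η < 1/2. It relies on the standard identity that a hypothesis with true error ε disagrees with the noisy labels with probability η + ε(1 − 2η). The Gaussian example distribution, the helper names, and the assumption that η is known are all illustrative choices, not details taken from the paper.

```python
import random


def sign(t):
    return 1 if t >= 0 else -1


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def noisy_sample(target, m, eta, rng):
    """Draw m Gaussian points labeled by `target`; flip each label with prob. eta."""
    sample = []
    for _ in range(m):
        x = [rng.gauss(0.0, 1.0) for _ in target]
        y = sign(dot(target, x))
        if rng.random() < eta:
            y = -y  # random classification noise
        sample.append((x, y))
    return sample


def noisy_disagreement(h, sample):
    """Fraction of noisy labels that the half-space h gets 'wrong'."""
    return sum(1 for x, y in sample if sign(dot(h, x)) != y) / len(sample)


def estimated_true_error(disagreement, eta):
    """Invert disagreement = eta + err * (1 - 2 * eta); requires eta < 1/2."""
    return (disagreement - eta) / (1.0 - 2.0 * eta)


rng = random.Random(0)          # fixed seed so the run is repeatable
target = [1.0, -2.0, 0.5]       # hypothetical target half-space
eta = 0.2
sample = noisy_sample(target, 5000, eta, rng)
d_target = noisy_disagreement(target, sample)
```

Even the target itself disagrees with roughly an η fraction of the noisy labels, so a weak hypothesis in this setting is one whose noisy disagreement is noticeably below 1/2, equivalently whose estimated true error is noticeably below 1/2.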

Our algorithm is based on a new method for removing outliers in data. Specifically, for any set $S$ of points in $\mathbb{R}^n$, each given to $b$ bits of precision, we show that one can remove only a small fraction of $S$ so that in the remaining set $T$, for every vector $v$, $\max_{x \in T} (v \cdot x)^2 \le \mathrm{poly}(n,b)\, \mathbb{E}_{x \in T} (v \cdot x)^2$; i.e., for any hyperplane through the origin, the maximum distance (squared) from a point in T to the plane is at most polynomially larger than the average. After removing these outliers, we are able to show that a modified version of the Perceptron Algorithm finds a weak hypothesis in polynomial time, even in the presence of random classification noise.
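The outlier condition quantifies over every direction v, but for a fixed finite set T it can be checked with linear algebra. If M is the second-moment matrix of T (the average of x xᵀ) and M is invertible, then the supremum over v of (v·x)² / E(v·x)² equals xᵀ M⁻¹ x, so the worst ratio is the maximum of that quantity over x in T. The sketch below computes it with a plain Gaussian-elimination solver; the function names and the toy point sets are assumptions, and this is only a check of the stated inequality, not the paper's outlier-removal procedure.

```python
def second_moment(points):
    """M[i][j] = average over the points of x_i * x_j."""
    n, m = len(points[0]), len(points)
    return [[sum(p[i] * p[j] for p in points) / m for j in range(n)]
            for i in range(n)]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("second-moment matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def worst_ratio(points):
    """max over x of x^T M^{-1} x, i.e. the largest value over all v of
    max_x (v.x)^2 / mean_x (v.x)^2 (assumes M is invertible)."""
    m_mat = second_moment(points)
    return max(sum(xi * zi for xi, zi in zip(x, solve(m_mat, list(x))))
               for x in points)


# Hypothetical toy sets: a balanced cross, then the same set plus one far point.
balanced = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
with_outlier = balanced + [(10.0, 0.0)]
ratio_balanced = worst_ratio(balanced)
ratio_outlier = worst_ratio(with_outlier)
```

The average of xᵀ M⁻¹ x over T is always n, so the worst ratio is at least n; a single far-away point pushes it toward the size of the set, which is the kind of point an outlier-removal step would target.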

Author information

Authors and Affiliations

School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA. {avrim, kannan+, svempala}@cs.cmu.edu
A. Blum, R. Kannan & S. Vempala
Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA. af1p@andrew.cmu.edu
A. Frieze

Additional information

Received February 5, 1997; revised July 2, 1997.

About this article

Cite this article

Blum, A., Frieze, A., Kannan, R. et al. A Polynomial-Time Algorithm for Learning Noisy Linear Threshold Functions. Algorithmica 22, 35–52 (1998). https://doi.org/10.1007/PL00013833

Issue Date: September 1998
DOI: https://doi.org/10.1007/PL00013833

Key words. Computational learning theory, Linear threshold functions, Perceptron algorithm, Learning with noise.
