
On-line learning of linear functions

computational complexity

Abstract

We present an algorithm for the on-line learning of linear functions which is optimal to within a constant factor with respect to bounds on the sum of squared errors for a worst-case sequence of trials. The bounds are logarithmic in the number of variables. Furthermore, the algorithm is shown to be optimally robust with respect to noise in the data (again to within a constant factor).
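The on-line setting and the loss measure named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's algorithm, so the update below is the classical Widrow-Hoff (LMS) gradient rule, used only as a stand-in; the function name `online_squared_loss` and the learning rate `eta` are assumptions made for illustration. In each trial the learner sees an instance, predicts with its current weights, receives the true value, and accumulates the squared error.

```python
from typing import List, Sequence, Tuple

Trial = Tuple[Sequence[float], float]  # (instance x_t, true value y_t)


def online_squared_loss(trials: Sequence[Trial], eta: float = 0.1) -> Tuple[List[float], float]:
    """Run the on-line protocol and return (final weights, sum of squared errors).

    Stand-in learner: Widrow-Hoff (LMS) update, NOT the algorithm of the paper.
    """
    if not trials:
        return [], 0.0
    n = len(trials[0][0])
    w = [0.0] * n          # start from the zero weight vector
    total = 0.0            # cumulative sum of squared errors
    for x, y in trials:
        y_hat = sum(wi * xi for wi, xi in zip(w, x))  # predict before seeing y
        err = y_hat - y
        total += err * err
        # Gradient step on the squared error of this trial.
        w = [wi - eta * err * xi for wi, xi in zip(w, x)]
    return w, total


# Noise-free example: the target linear function is y = x[0].
trials = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)] * 50
weights, loss = online_squared_loss(trials, eta=0.5)
```

In this noise-free example the cumulative squared error stays bounded no matter how many trials are appended, which is the kind of quantity the worst-case bounds in the abstract refer to.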




Cite this article

Littlestone, N., Warmuth, M.K. & Long, P.M. On-line learning of linear functions. Comput Complexity 5, 1–23 (1995). https://doi.org/10.1007/BF01277953
