A Second-Order Perceptron Algorithm

  • Conference paper
Computational Learning Theory (COLT 2002)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2375)

Abstract

We introduce a variant of the Perceptron algorithm, called the second-order Perceptron algorithm, which is able to exploit certain spectral properties of the data. We analyze the second-order Perceptron algorithm in the mistake bound model of on-line learning and prove bounds in terms of the eigenvalues of the Gram matrix created from the data. The performance of the second-order Perceptron algorithm is affected by the setting of a parameter controlling the sensitivity to the distribution of the eigenvalues of the Gram matrix. Since this information is not available in advance to on-line algorithms, we also design a refined version of the second-order Perceptron algorithm which adaptively sets the value of this parameter. For this second algorithm we are able to prove mistake bounds corresponding to a nearly optimal constant setting of the parameter.
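
The abstract does not spell out the update rule, so the following is only a minimal sketch of the fixed-parameter algorithm under stated assumptions. It follows the commonly described form: keep a vector v of signed mistaken instances together with the inverse of aI plus the sum of outer products of mistaken instances, and predict with the sign of v^T (aI + S S^T + x x^T)^{-1} x. The function name `second_order_perceptron` and the variable names `a`, `v`, and `a_inv` are illustrative choices, not taken from the paper, and `a` is assumed to play the role of the parameter mentioned above. The adaptive version is not covered.

```python
def second_order_perceptron(examples, a=1.0):
    """One online pass over (x, y) pairs with y in {-1, +1}.

    Sketch only: `a` > 0 is assumed to be the sensitivity parameter.
    Returns (number of mistakes, v, inverse correlation matrix).
    """
    n = len(examples[0][0])
    # Inverse of a*I; updated by Sherman-Morrison after every mistake.
    a_inv = [[(1.0 / a if i == j else 0.0) for j in range(n)] for i in range(n)]
    v = [0.0] * n  # running sum of y * x over mistaken rounds
    mistakes = 0
    for x, y in examples:
        # u = A^{-1} x, where A = a*I + sum of x x^T over past mistakes.
        u = [sum(a_inv[i][j] * x[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(x[i] * u[i] for i in range(n))
        # By Sherman-Morrison, v^T (A + x x^T)^{-1} x = (v^T A^{-1} x) / denom.
        margin = sum(v[i] * u[i] for i in range(n)) / denom
        y_hat = 1 if margin >= 0 else -1
        if y_hat != y:
            mistakes += 1
            v = [v[i] + y * x[i] for i in range(n)]
            # Rank-one update: (A + x x^T)^{-1} = A^{-1} - u u^T / denom.
            a_inv = [[a_inv[i][j] - u[i] * u[j] / denom for j in range(n)]
                     for i in range(n)]
    return mistakes, v, a_inv
```

For instance, `second_order_perceptron([([1.0, 0.0], 1), ([0.0, 1.0], -1)], a=1.0)` runs two rounds on a toy stream. In this sketch, a very large `a` makes `a_inv` close to a scaled identity, so the predictions reduce to the sign of v·x, as in the standard Perceptron.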

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cesa-Bianchi, N., Conconi, A., Gentile, C. (2002). A Second-Order Perceptron Algorithm. In: Kivinen, J., Sloan, R.H. (eds) Computational Learning Theory. COLT 2002. Lecture Notes in Computer Science, vol 2375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45435-7_9

  • DOI: https://doi.org/10.1007/3-540-45435-7_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43836-6

  • Online ISBN: 978-3-540-45435-9

  • eBook Packages: Springer Book Archive
