Abstract
Shifting bounds for on-line classification algorithms ensure good performance on any sequence of examples that is well predicted by a sequence of smoothly changing classifiers. When proving shifting bounds for kernel-based classifiers, one also faces the problem of storing a number of support vectors that can grow unboundedly, unless an eviction policy is used to keep this number under control. In this paper, we show that shifting and on-line learning on a budget can be combined surprisingly well. First, we introduce and analyze a shifting Perceptron algorithm achieving the best known shifting bounds while using an unlimited budget. Second, we show that by equipping the Perceptron algorithm with the simplest possible eviction policy, which discards a random support vector each time a new one comes in, we achieve a shifting bound close to the one obtained with no budget restriction. More importantly, we show that our randomized algorithm strikes the optimal trade-off \(U = \Theta\bigl(\sqrt{B}\bigr)\) between the budget \(B\) and the norm \(U\) of the largest classifier in the comparison sequence.
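The random-eviction policy described in the abstract is simple enough to sketch in code. The following is a minimal illustration, not the authors' exact algorithm or analysis: a kernelized Perceptron that stores at most B support vectors and, on a mistake that would exceed the budget, discards a uniformly random stored support vector before adding the new example. The Gaussian kernel and all class and parameter names below are illustrative assumptions.

```python
# Sketch of a randomized budget (kernel) Perceptron, assuming a Gaussian kernel.
# This is an illustration of the random-eviction idea, not the paper's exact method.

import math
import random


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian kernel on plain Python sequences (illustrative choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


class RandomizedBudgetPerceptron:
    def __init__(self, budget, kernel=rbf_kernel):
        self.budget = budget      # B: maximum number of stored support vectors
        self.kernel = kernel
        self.support = []         # list of (x_i, y_i) pairs, y_i in {-1, +1}

    def predict(self, x):
        """Sign of the kernel expansion over the stored support vectors."""
        score = sum(y_i * self.kernel(x_i, x) for x_i, y_i in self.support)
        return 1 if score >= 0 else -1

    def update(self, x, y):
        """Standard Perceptron rule: change the hypothesis only on a mistake."""
        if self.predict(x) != y:
            if len(self.support) >= self.budget:
                # Random eviction: discard a uniformly random support vector
                # so the new one fits within the budget B.
                self.support.pop(random.randrange(len(self.support)))
            self.support.append((x, y))
            return True   # a mistake occurred
        return False
```

In use, one would call update(x_t, y_t) on each example of the stream; the stored set never exceeds B, so memory and per-prediction time stay bounded regardless of the stream length.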
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Cesa-Bianchi, N., Gentile, C. (2006). Tracking the Best Hyperplane with a Simple Budget Perceptron. In: Lugosi, G., Simon, H.U. (eds.) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol. 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_36
DOI: https://doi.org/10.1007/11776420_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9