Abstract
We present a new and simple algorithm for learning large margin classifiers that works in a truly online manner. The algorithm generates a linear classifier by averaging the weights of several perceptron-like algorithms run in parallel, in order to approximate the Bayes point. A random subsample of the incoming data stream is used to ensure diversity in the perceptron solutions. We experimentally study the algorithm’s performance in online and batch learning settings. The online experiments show that our algorithm produces a low prediction error on the training sequence and tracks the presence of concept drift. On the batch problems its performance is comparable to that of the maximum margin algorithm, which explicitly maximises the margin.
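The averaging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the plain perceptron update rule, assumes that each perceptron independently sees each incoming example with a fixed probability (the hypothetical `sample_prob` parameter), and uses a toy data generator. All class, function, and parameter names are made up for illustration.

```python
import random


class AveragedPerceptronEnsemble:
    """Illustrative sketch only: several perceptrons, each updated on a
    random subsample of the stream, combined by averaging their weights."""

    def __init__(self, dim, n_perceptrons=10, sample_prob=0.5, seed=0):
        self.weights = [[0.0] * dim for _ in range(n_perceptrons)]
        self.sample_prob = sample_prob  # assumed inclusion probability
        self.rng = random.Random(seed)

    def averaged_weights(self):
        # Component-wise mean of the individual perceptron weight vectors.
        n = len(self.weights)
        return [sum(column) / n for column in zip(*self.weights)]

    def predict(self, x):
        w = self.averaged_weights()
        score = sum(wi * xi for wi, xi in zip(w, x))
        return 1 if score >= 0 else -1

    def update(self, x, y):
        for w in self.weights:
            if self.rng.random() >= self.sample_prob:
                continue  # this perceptron does not see the example
            # Plain perceptron rule (an assumption): update on a mistake.
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                for i, xi in enumerate(x):
                    w[i] += y * xi


def run_online(model, stream):
    """Predict first, then update; return the number of online mistakes."""
    mistakes = 0
    for x, y in stream:
        if model.predict(x) != y:
            mistakes += 1
        model.update(x, y)
    return mistakes


def make_stream(n, margin=0.5, seed=1):
    """Toy separable stream: the label is the sign of x[0] - x[1],
    and points closer than `margin` to the boundary are discarded."""
    rng = random.Random(seed)
    stream = []
    while len(stream) < n:
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        gap = x[0] - x[1]
        if abs(gap) >= margin:
            stream.append((x, 1 if gap > 0 else -1))
    return stream


if __name__ == "__main__":
    data = make_stream(500)
    ensemble = AveragedPerceptronEnsemble(dim=2)
    print("online mistakes:", run_online(ensemble, data))
    print("averaged weights:", ensemble.averaged_weights())
```

Because every perceptron sees a different random subset of the stream, the individual weight vectors differ, and their mean serves as a rough stand-in for a central point among the consistent solutions. The prediction is made before the update, so the mistake count reflects prediction error on the training sequence in the online sense.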
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Harrington, E., Herbrich, R., Kivinen, J., Platt, J., Williamson, R.C. (2003). Online Bayes Point Machines. In: Whang, KY., Jeon, J., Shim, K., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2003. Lecture Notes in Computer Science, vol 2637. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36175-8_24
DOI: https://doi.org/10.1007/3-540-36175-8_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-04760-5
Online ISBN: 978-3-540-36175-6
eBook Packages: Springer Book Archive