Abstract
We consider a natural convolution kernel defined by a directed graph. Each edge contributes an input; the inputs along a path form a product, and the products for all paths are summed. The edges also carry a set of probabilities chosen so that the total outflow from each node is one. We then discuss multiplicative updates on these graphs, where the prediction is essentially a kernel computation and the update contributes a factor to each edge. After the update the total outflow from each node is no longer one, but clever algorithms re-normalize the weights on the paths so that the total outflow from each node is one again. Finally, we discuss the use of regular expressions for speeding up the kernel and re-normalization computations. In particular, we rewrite the multiplicative algorithms that predict as well as the best pruning of a series-parallel graph in terms of efficient kernel computations.
Part of this work was done while the author visited the University of California, Santa Cruz.
Supported by NSF grant CCR-9821087.
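As a minimal sketch of the construction the abstract describes, the following Python computes the path kernel of a small directed acyclic graph (the sum, over all source-to-sink paths, of the product of the edge inputs) and then re-normalizes the edge weights by standard weight pushing so that the outflow from every node is one again. The graph, node names, and function names are illustrative assumptions, not code from the paper.

# A small series-parallel DAG: graph[u] is a list of (child, edge input).
# The graph and its values are made up for illustration.
edges = {
    "s": [("a", 0.5), ("b", 0.5)],
    "a": [("t", 2.0)],
    "b": [("a", 1.0), ("t", 3.0)],
    "t": [],
}

def path_kernel(graph, node, sink="t"):
    # Sum over all paths from node to sink of the product of edge inputs.
    # (For the convolution kernel proper, each edge input would itself be
    # a product of two examples' edge features; here it is one number.)
    if node == sink:
        return 1.0  # the empty path contributes the empty product
    return sum(w * path_kernel(graph, v, sink) for v, w in graph[node])

def weight_push(graph, sink="t"):
    # Re-normalize so that the outflow from every node sums to one again
    # (standard weight pushing): w'(u,v) = w(u,v) * K(v) / K(u), where
    # K(u) is the path kernel from u to the sink. Every path's product is
    # rescaled by the same constant, so relative path weights are kept.
    K = {u: path_kernel(graph, u, sink) for u in graph}
    return {u: [(v, w * K[v] / K[u]) for v, w in vs]
            for u, vs in graph.items()}

# Three s-to-t paths: 0.5*2.0 + 0.5*3.0 + 0.5*1.0*2.0 = 3.5
print(path_kernel(edges, "s"))
pushed = weight_push(edges)
print(sum(w for _, w in pushed["s"]))  # outflow from the source is 1.0 again

On larger graphs the recursion would be memoized, i.e. computed in one dynamic-programming pass in reverse topological order, so the kernel costs time linear in the number of edges rather than the number of paths.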
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takimoto, E., Warmuth, M.K. (2002). Path Kernels and Multiplicative Updates. In: Kivinen, J., Sloan, R.H. (eds) Computational Learning Theory. COLT 2002. Lecture Notes in Computer Science, vol 2375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45435-7_6
DOI: https://doi.org/10.1007/3-540-45435-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43836-6
Online ISBN: 978-3-540-45435-9
eBook Packages: Springer Book Archive