Abstract
No-regret algorithms for online convex optimization are potent online learning tools and have proven successful in a wide range of applications. Considering affine and external regret, we investigate what happens when a set of no-regret learners (voters) merge their respective decisions in each learning iteration into a single, common one in the form of a convex combination. We show that an agent (or algorithm) that executes this merged decision in each iteration of the online learning process, and each time feeds back a copy of its own reward function to the voters, incurs sublinear regret itself. As a by-product, we obtain a simple method for constructing new no-regret algorithms out of known ones.
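The merging scheme described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: the voters are projected online gradient ascent learners on the interval [-1, 1] with step size eta0 / sqrt(t), the rewards are the concave functions r_t(x) = -(x - c_t)^2, and the names `GradientAscentVoter` and `simulate` are hypothetical. The agent plays a fixed convex combination of the voters' decisions and passes its own reward function back to every voter; external regret is measured against the best fixed decision in hindsight.

```python
import math


def project(x, lo=-1.0, hi=1.0):
    """Project a scalar decision onto the feasible interval [lo, hi]."""
    return max(lo, min(hi, x))


class GradientAscentVoter:
    """Hypothetical voter: projected online gradient ascent on [-1, 1]."""

    def __init__(self, eta0):
        self.eta0 = eta0    # base step size (assumed schedule: eta0 / sqrt(t))
        self.x = 0.0        # current decision
        self.t = 0          # number of feedback rounds seen so far
        self.reward = 0.0   # reward the voter would have earned on its own

    def decide(self):
        return self.x

    def feedback(self, reward_fn, grad_fn):
        # The voter receives a copy of the agent's reward function and
        # evaluates it (and its gradient) at its own decision.
        self.reward += reward_fn(self.x)
        self.t += 1
        step = self.eta0 / math.sqrt(self.t)
        self.x = project(self.x + step * grad_fn(self.x))


def simulate(weights, step_sizes, targets):
    """Return (regret of merged agent, list of individual voter regrets)."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    voters = [GradientAscentVoter(eta0) for eta0 in step_sizes]
    merged_reward = 0.0
    for c in targets:
        # Assumed concave reward for this round and its gradient.
        reward_fn = lambda x, c=c: -(x - c) ** 2
        grad_fn = lambda x, c=c: -2.0 * (x - c)
        # Merged decision: fixed convex combination of the voters' decisions.
        z = sum(w * v.decide() for w, v in zip(weights, voters))
        merged_reward += reward_fn(z)
        for v in voters:
            v.feedback(reward_fn, grad_fn)
    # Best fixed decision in hindsight for these quadratic rewards is the
    # mean of the targets, projected onto the feasible interval.
    best = project(sum(targets) / len(targets))
    best_reward = sum(-(best - c) ** 2 for c in targets)
    return best_reward - merged_reward, [best_reward - v.reward for v in voters]


if __name__ == "__main__":
    T = 2000
    targets = [math.sin(0.37 * t) for t in range(T)]
    merged, individual = simulate([0.5, 0.5], [0.5, 0.1], targets)
    print("average regret of merged agent:", merged / T)
    print("average regret of each voter:  ", [r / T for r in individual])
```

In this concave setting the merged agent's regret can never exceed the weighted average of the voters' regrets, since a concave reward evaluated at a convex combination is at least the same combination of the individual rewards. That gives some intuition for why the merged agent inherits sublinear regret from the voters.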
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Calliess, J.-P. (2009). On Fixed Convex Combinations of No-Regret Learners. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_37
DOI: https://doi.org/10.1007/978-3-642-03070-3_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3
eBook Packages: Computer Science, Computer Science (R0)