Making Online Decisions with Bounded Memory

Lu, Chi-Jen; Lu, Wei-Fu

doi:10.1007/978-3-642-24412-4_21

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6925)

Abstract

We study the online decision problem in which there are \(T\) steps to play and \(n\) actions to choose from. For this problem, several algorithms achieve an optimal regret of \(O(\sqrt{T \ln n})\), but they all require about \(T^n\) states, which one may not be able to afford when \(n\) and \(T\) are very large. We are interested in such large-scale problems, and we would like to understand what an online algorithm can achieve with only a bounded number of states. We provide two algorithms, both with \(m^{n-1}\) states, for a parameter \(m\), which achieve regret of \(O(m + (T/m)\ln(mn))\) and \(O(n \sqrt{m} + T/\sqrt{m})\), respectively. We also show that any online algorithm with \(m^{n-1}\) states must suffer a regret of \(\Omega(T/m)\), which is close to what our algorithms achieve.
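For orientation, the setting in the abstract can be sketched with the standard Hedge (multiplicative weights) algorithm, one of the classical unbounded-memory methods whose regret is of order \(\sqrt{T \ln n}\). This is not one of the paper's bounded-memory algorithms. The function name `hedge_regret`, the loss matrix layout, and the learning rate `eta` are assumptions made for illustration, with losses assumed to lie in \([0, 1]\).

```python
import math
import random


def hedge_regret(losses, eta):
    """Run Hedge on a T x n matrix of losses in [0, 1].

    Illustrative sketch only (hypothetical helper, not from the paper).
    Returns (expected_loss, best_action_loss, regret), where the expected
    loss is taken over the algorithm's own randomized choice of action.
    """
    n = len(losses[0])
    weights = [1.0] * n          # one real-valued weight per action
    cumulative = [0.0] * n       # cumulative loss of each fixed action
    expected_loss = 0.0
    for row in losses:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of playing action i with probability probs[i].
        expected_loss += sum(p * l for p, l in zip(probs, row))
        # Multiplicative update: penalize each action by its observed loss.
        for i, l in enumerate(row):
            cumulative[i] += l
            weights[i] *= math.exp(-eta * l)
    best = min(cumulative)       # loss of the best fixed action in hindsight
    return expected_loss, best, expected_loss - best


if __name__ == "__main__":
    random.seed(0)
    T, n = 1000, 5
    losses = [[random.random() for _ in range(n)] for _ in range(T)]
    eta = math.sqrt(8 * math.log(n) / T)   # standard tuning for known T
    exp_loss, best, regret = hedge_regret(losses, eta)
    print(f"regret = {regret:.3f}, bound = {math.sqrt(T * math.log(n) / 2):.3f}")
```

With this choice of `eta`, the standard analysis guarantees regret at most \(\sqrt{(T \ln n)/2}\) for any loss sequence in \([0, 1]\). The sketch stores a real-valued weight for every action, which is the kind of memory cost the abstract contrasts with a bounded number of states.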

Author information

Authors and Affiliations

Institute of Information Science, Academia Sinica, Taipei, Taiwan
Chi-Jen Lu
Department of Computer Science and Information Engineering, Asia University, Taiwan
Wei-Fu Lu

Editor information

Editors and Affiliations

Department of Computer Science, University of Helsinki, (Gustaf Hällströmin katu 2b), P.O. Box 68, 00014, Helsinki, Finland
Jyrki Kivinen & Esko Ukkonen
Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, AB, Canada
Csaba Szepesvári
Division of Computer Science, Hokkaido University, N-14, W-9, 060-0814, Sapporo, Japan
Thomas Zeugmann

Cite this paper

Lu, CJ., Lu, WF. (2011). Making Online Decisions with Bounded Memory. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2011. Lecture Notes in Computer Science, vol 6925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24412-4_21

DOI: https://doi.org/10.1007/978-3-642-24412-4_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24411-7
Online ISBN: 978-3-642-24412-4
eBook Packages: Computer Science, Computer Science (R0)