Making Online Decisions with Bounded Memory

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6925)

Abstract

We study the online decision problem in which there are \(T\) steps to play and \(n\) actions to choose from. For this problem, several algorithms achieve an optimal regret of \(O(\sqrt{T \ln n})\), but they all require about \(T^n\) states, which one may not be able to afford when \(n\) and \(T\) are very large. We are interested in such large-scale problems, and we would like to understand what an online algorithm can achieve with only a bounded number of states. We provide two algorithms, both with \(m^{n-1}\) states, for a parameter \(m\), which achieve regret of \(O(m + (T/m)\ln(mn))\) and \(O(n \sqrt{m} + T/\sqrt{m})\), respectively. We also show that any online algorithm with \(m^{n-1}\) states must suffer a regret of \(\Omega(T/m)\), which is close to what our algorithms achieve.
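To make the memory cost of the unbounded-memory baseline concrete, here is a minimal sketch of the standard Hedge (multiplicative weights) strategy, one of the well-known algorithms with regret \(O(\sqrt{T \ln n})\). It is not one of the paper's bounded-memory algorithms. The function name `hedge`, the 0/1 loss model, the random loss sequence, and the learning rate \(\eta = \sqrt{8 \ln n / T}\) are assumptions chosen for illustration. The point to notice is that the algorithm's state is the full vector of cumulative losses, which with 0/1 losses can take up to \((T+1)^n\) distinct values.

```python
import math
import random

def hedge(losses, eta):
    """Run Hedge on a T x n matrix of losses in [0, 1].

    Returns (expected loss of Hedge, loss of the best fixed action).
    """
    n = len(losses[0])
    # The state: one cumulative-loss counter per action.
    # With 0/1 losses this vector ranges over {0, ..., T}^n.
    cum = [0.0] * n
    alg_loss = 0.0
    for row in losses:
        # Subtract the minimum before exponentiating for numerical stability;
        # this does not change the normalized probabilities.
        base = min(cum)
        weights = [math.exp(-eta * (c - base)) for c in cum]
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of playing an action drawn from probs.
        alg_loss += sum(p * l for p, l in zip(probs, row))
        cum = [c + l for c, l in zip(cum, row)]
    return alg_loss, min(cum)

# Illustrative run on an assumed random 0/1 loss sequence.
random.seed(0)
T, n = 2000, 5
losses = [[random.randint(0, 1) for _ in range(n)] for _ in range(T)]
eta = math.sqrt(8 * math.log(n) / T)
alg_loss, best_loss = hedge(losses, eta)
regret = alg_loss - best_loss
# Textbook guarantee for this eta: regret <= sqrt(T ln(n) / 2).
bound = math.sqrt(T * math.log(n) / 2)
print(f"regret = {regret:.2f}, bound = {bound:.2f}")
```

The bounded-memory question in the abstract asks what can be achieved when this cumulative-loss vector cannot be stored exactly and the algorithm is limited to far fewer states.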

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lu, CJ., Lu, WF. (2011). Making Online Decisions with Bounded Memory. In: Kivinen, J., Szepesvári, C., Ukkonen, E., Zeugmann, T. (eds) Algorithmic Learning Theory. ALT 2011. Lecture Notes in Computer Science, vol 6925. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24412-4_21

  • DOI: https://doi.org/10.1007/978-3-642-24412-4_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24411-7

  • Online ISBN: 978-3-642-24412-4

  • eBook Packages: Computer Science, Computer Science (R0)
