Online Learning with Constraints

  • Conference paper
Learning Theory (COLT 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4005)

Abstract

We study online learning where the objective of the decision maker is to maximize her average long-term reward given that some average constraints are satisfied along the sample path. We define the reward-in-hindsight as the highest reward the decision maker could have achieved, while satisfying the constraints, had she known Nature’s choices in advance. We show that in general the reward-in-hindsight is not attainable. The convex hull of the reward-in-hindsight function is, however, attainable. For the important case of a single constraint the convex hull turns out to be the highest attainable function. We further provide an explicit strategy that attains this convex hull using a calibrated forecasting rule.
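As a rough illustration of the reward-in-hindsight for a single constraint, the sketch below computes the best average reward of a fixed mixed action against a known empirical distribution of Nature's choices. The abstract gives no formulas, so the setup here is an assumption: finite action sets, a reward matrix `r`, a cost matrix `c`, and a threshold `c0` on the expected cost. The function name and all parameters are hypothetical. With one linear constraint, an optimal mixed action can be found among the feasible pure actions and the two-action mixtures that meet the constraint with equality, so the sketch enumerates those instead of calling an LP solver.

```python
from itertools import combinations

def reward_in_hindsight(r, c, q, c0, eps=1e-12):
    """Best average reward of a fixed mixed action p, given Nature's
    empirical distribution q, subject to expected cost <= c0.

    r[a][b], c[a][b]: assumed reward and cost when the decision maker
    plays a and Nature plays b. Returns None if no mixed action is feasible.
    """
    n, m = len(r), len(q)
    # Expected reward and cost of each pure action against q.
    rq = [sum(r[a][b] * q[b] for b in range(m)) for a in range(n)]
    cq = [sum(c[a][b] * q[b] for b in range(m)) for a in range(n)]

    best = None
    # Candidate 1: pure actions that already satisfy the constraint.
    for a in range(n):
        if cq[a] <= c0 + eps:
            best = rq[a] if best is None else max(best, rq[a])

    # Candidate 2: mixtures of one feasible and one infeasible action,
    # weighted so that the expected cost equals c0 exactly.
    for i, j in combinations(range(n), 2):
        lo, hi = (i, j) if cq[i] <= cq[j] else (j, i)
        if cq[lo] <= c0 < cq[hi]:
            w = (c0 - cq[lo]) / (cq[hi] - cq[lo])  # weight on the costly action
            best = max(best, (1 - w) * rq[lo] + w * rq[hi])
    return best

# Toy instance: action 0 always pays 1 but incurs cost 1 when Nature plays 0;
# action 1 pays nothing and costs nothing.
r = [[1.0, 1.0], [0.0, 0.0]]
c = [[1.0, 0.0], [0.0, 0.0]]
print(reward_in_hindsight(r, c, q=[1.0, 0.0], c0=0.5))
print(reward_in_hindsight(r, c, q=[0.0, 1.0], c0=0.5))
```

In this toy instance the value depends on Nature's empirical distribution: when Nature always plays 0, the costly action can be used only half the time, while against Nature always playing 1 it can be used all the time. The sketch only evaluates the hindsight benchmark; it does not implement the calibrated-forecasting strategy mentioned in the abstract.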

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mannor, S., Tsitsiklis, J.N. (2006). Online Learning with Constraints. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_39

  • DOI: https://doi.org/10.1007/11776420_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35294-5

  • Online ISBN: 978-3-540-35296-9

  • eBook Packages: Computer Science (R0)