research-article

Learning hurdles for sleeping experts

Published: 08 January 2012

Abstract

We study the online decision problem in which the set of available actions varies over time, also known as the sleeping experts problem. We consider the setting in which performance is compared against the best ordering of actions in hindsight. In this paper, both the payoff function and the availability of actions are adversarial. Kleinberg et al. (2008) gave a computationally efficient no-regret algorithm for the setting where payoffs are stochastic. Kanade et al. (2009) gave an efficient no-regret algorithm for the setting where action availability is stochastic.
However, whether a computationally efficient no-regret algorithm exists in the adversarial setting was posed as an open problem by Kleinberg et al. (2008). We show that such an algorithm would imply an algorithm for PAC learning DNF, a long-standing and important open problem. We also show that a related problem, the gambling problem, posed as an open problem by Abernethy (2010), is related to agnostically learning halfspaces, albeit under restricted distributions.
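
To make the benchmark concrete, the following is a minimal sketch of the "best ordering of actions in hindsight" comparison on a toy instance. It is an illustration only, not an algorithm from the paper: the function names, the representation of a round as an (available set, payoff dict) pair, and the assumption that at least one action is available in every round are all hypothetical. The search enumerates all n! orderings by brute force, so it is usable only for very small n.

```python
from itertools import permutations


def ordering_payoff(ordering, rounds):
    """Total payoff of a fixed ordering: in each round, play the
    highest-ranked action that is currently available (awake)."""
    total = 0.0
    for available, payoffs in rounds:
        # Assumption: every round has at least one available action.
        chosen = next(a for a in ordering if a in available)
        total += payoffs[chosen]
    return total


def best_ordering_in_hindsight(actions, rounds):
    """Brute-force search over all n! orderings (illustrative only)."""
    return max(permutations(actions), key=lambda o: ordering_payoff(o, rounds))


def regret(learner_payoff, actions, rounds):
    """Gap between the best ordering in hindsight and the learner's payoff."""
    best = best_ordering_in_hindsight(actions, rounds)
    return ordering_payoff(best, rounds) - learner_payoff


# Toy instance: each round is (set of available actions, payoff per available action).
actions = ["a", "b", "c"]
rounds = [
    ({"a", "b"}, {"a": 1.0, "b": 0.0}),
    ({"b", "c"}, {"b": 0.0, "c": 1.0}),
    ({"a", "c"}, {"a": 0.0, "c": 1.0}),
]

best = best_ordering_in_hindsight(actions, rounds)
print(best, ordering_payoff(best, rounds))
# A learner that collected payoff 1.0 on this sequence has this regret:
print(regret(1.0, actions, rounds))
```

A no-regret algorithm would need its average regret against this benchmark to vanish as the number of rounds grows, while running in time polynomial in the number of actions rather than enumerating orderings as this sketch does.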

References

[1] J. Abernethy. Can we learn to gamble efficiently? (open problem). In COLT, 2010.
[2] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45:501-555, May 1998.
[3] S. Ben-David, D. Pál, and S. Shalev-Shwartz. Agnostic online learning. In COLT, 2009.
[4] A. Blum and Y. Mansour. From external to internal regret. Journal of Machine Learning Research, 8:1307-1324, 2007.
[5] N. Cesa-Bianchi, A. Conconi, and C. Gentile. On the generalization ability of on-line learning algorithms. IEEE Transactions on Information Theory, 50(9):2050-2057, 2004.
[6] N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
[7] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the Second European Conference on Computational Learning Theory, 1995.
[8] Y. Freund, R. E. Schapire, Y. Singer, and M. K. Warmuth. Using and combining predictors that specialize. In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, 1997.
[9] M. R. Garey and D. S. Johnson. Computers and Intractability. W. H. Freeman and Co., New York, NY, USA, 1979.
[10] J. Håstad. Some optimal inapproximability results. J. ACM, 48:798-859, July 2001.
[11] D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100:78-150, 1992.
[12] A. T. Kalai, V. Kanade, and Y. Mansour. Reliable agnostic learning. In COLT, 2009.
[13] A. T. Kalai, A. R. Klivans, Y. Mansour, and R. A. Servedio. Agnostically learning halfspaces. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, 2005.
[14] V. Kanade, B. McMahan, and B. Bryan. Sleeping experts and bandits with stochastic action availability and adversarial rewards. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, pages 272-279, 2009.
[15] M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2-3):115-141, 1994.
[16] R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. Regret bounds for sleeping experts and bandits. Machine Learning, pages 1-28, 2008.
[17] A. R. Klivans and R. Servedio. Learning DNF in time 2^{Õ(n^{1/3})}. In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, STOC '01, pages 258-265, New York, NY, USA, 2001. ACM.
[18] N. Littlestone. From on-line to batch learning. In Proceedings of the Second Annual Workshop on Computational Learning Theory, 1989.
[19] S. Shalev-Shwartz, O. Shamir, and K. Sridharan. Learning kernel-based halfspaces with the zero-one loss. In Proceedings of the 23rd Annual Conference on Learning Theory, 2010.

Cited By

  • (2014) Online combinatorial optimization with stochastic decision sets and adversarial losses. Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, pages 2780-2788. DOI: 10.5555/2969033.2969137. Online publication date: 8-Dec-2014
  • (2013) Online learning in Markov decision processes with adversarially chosen transition probability distributions. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, pages 2508-2516. DOI: 10.5555/2999792.2999892. Online publication date: 5-Dec-2013
  • (2013) Sleeping Experts in Wireless Networks. Distributed Computing, pages 344-357. DOI: 10.1007/978-3-642-41527-2_24. Online publication date: 2013

Published In

ITCS '12: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference
January 2012
516 pages
ISBN:9781450311151
DOI:10.1145/2090236

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. lower bounds
  2. online learning
  3. sleeping experts

Qualifiers

  • Research-article

Conference

ITCS '12: Innovations in Theoretical Computer Science
January 8 - 10, 2012
Cambridge, Massachusetts

Acceptance Rates

ITCS '12 Paper Acceptance Rate: 39 of 93 submissions, 42%
Overall Acceptance Rate: 172 of 513 submissions, 34%
