Learning hurdles for sleeping experts

Authors:

Thomas Steinke

ITCS '12: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference

Pages 11 - 18

https://doi.org/10.1145/2090236.2090238

Published: 08 January 2012

Abstract

We study the online decision problem where the set of available actions varies over time, also called the sleeping experts problem. We consider the setting where the performance comparison is made with respect to the best ordering of actions in hindsight. In this paper, both the payoff function and the availability of actions are adversarial. Kleinberg et al. (2008) gave a computationally efficient no-regret algorithm in the setting where payoffs are stochastic. Kanade et al. (2009) gave an efficient no-regret algorithm in the setting where action availability is stochastic.

However, the question of whether there exists a computationally efficient no-regret algorithm in the adversarial setting was posed as an open problem by Kleinberg et al. (2008). We show that such an algorithm would imply an algorithm for PAC learning DNF, a long-standing and important open problem. We also show that a related problem, the gambling problem, posed as an open problem by Abernethy (2010), is related to agnostically learning halfspaces, albeit under restricted distributions.
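To make the comparison class concrete, the following is a minimal illustrative sketch and is not taken from the paper. It assumes that an ordering plays, in each round, its highest-ranked action among those currently available, and that regret is the payoff of the best such ordering in hindsight minus the learner's payoff. The function names, the toy rounds, and the learner's choices are all hypothetical. The brute-force search enumerates every ordering, so it only serves to show what is being measured on tiny instances.

```python
from itertools import permutations

def ordering_payoff(ordering, rounds):
    """Total payoff of always playing the highest-ranked available action."""
    total = 0.0
    for available, payoffs in rounds:
        # First action in the ordering that is awake this round.
        choice = next(a for a in ordering if a in available)
        total += payoffs[choice]
    return total

def best_ordering(actions, rounds):
    """Brute-force search over all orderings (n! of them; tiny inputs only)."""
    return max(permutations(actions), key=lambda o: ordering_payoff(o, rounds))

def regret(actions, rounds, played):
    """Payoff of the best ordering in hindsight minus the learner's payoff."""
    learner = sum(payoffs[a] for (_, payoffs), a in zip(rounds, played))
    return ordering_payoff(best_ordering(actions, rounds), rounds) - learner

# Hypothetical toy instance: each round is (available actions, payoff per action).
actions = ["a", "b", "c"]
rounds = [
    ({"a", "b"}, {"a": 1, "b": 0}),
    ({"b", "c"}, {"b": 0, "c": 1}),
    ({"a", "c"}, {"a": 0, "c": 1}),
    ({"a", "b", "c"}, {"a": 0, "b": 0, "c": 1}),
]
played = ["b", "b", "a", "c"]  # hypothetical learner choices, one per round

print(best_ordering(actions, rounds))   # ('c', 'a', 'b')
print(regret(actions, rounds, played))  # 3.0
```

In this toy instance the ordering (c, a, b) earns 4 while the learner earns 1, so the regret is 3. A no-regret algorithm is one whose regret grows sublinearly in the number of rounds. The sketch only evaluates regret after the fact and says nothing about how to choose actions online.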

References

[1] J. Abernethy. Can we learn to gamble efficiently? (open problem). In COLT, 2010.

[2] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and the hardness of approximation problems. J. ACM, 45:501--555, May 1998.

[3] S. Ben-David, D. Pál, and S. Shalev-Shwartz. Agnostic online learning. In COLT, 2009.

[4] A. Blum and Y. Mansour. From external to internal regret. Journal of Machine Learning Research, 8:1307--1324, 2007.

[5] N. Cesa-Bianchi, A. Conconi, and C. Gentile. On the generalization ability of on-line learning algorithms. IEEE Transactions on Information Theory, 50(9):2050--2057, 2004.

[6] N. Cesa-Bianchi and G. Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.

[7] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the Second European Conference on Computational Learning Theory, 1995.

[8] Y. Freund, R. E. Schapire, Y. Singer, and M. K. Warmuth. Using and combining predictors that specialize. In Proceedings of the Twenty-Ninth Annual ACM Symposium on Theory of Computing, 1997.

[9] M. R. Garey and D. S. Johnson. Computers and Intractability. W. H. Freeman and Co., New York, NY, USA, 1979.

[10] J. Håstad. Some optimal inapproximability results. J. ACM, 48:798--859, July 2001.

[11] D. Haussler. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100:78--150, 1992.

[12] A. T. Kalai, V. Kanade, and Y. Mansour. Reliable agnostic learning. In COLT, 2009.

[13] A. T. Kalai, A. R. Klivans, Y. Mansour, and R. A. Servedio. Agnostically learning halfspaces. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, 2005.

[14] V. Kanade, B. McMahan, and B. Bryan. Sleeping experts and bandits with stochastic action availability and adversarial rewards. In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, pages 272--279, 2009.

[15] M. J. Kearns, R. E. Schapire, and L. M. Sellie. Toward efficient agnostic learning. Machine Learning, 17(2-3):115--141, 1994.

[16] R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. Regret bounds for sleeping experts and bandits. Machine Learning, pages 1--28, 2008.

[17] A. R. Klivans and R. Servedio. Learning DNF in time. In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing, STOC '01, pages 258--265, New York, NY, USA, 2001. ACM.

[18] N. Littlestone. From on-line to batch learning. In Proceedings of the Second Annual Workshop on Computational Learning Theory, 1989.

[19] S. Shalev-Shwartz, O. Shamir, and K. Sridharan. Learning kernel-based halfspaces with the zero-one loss. In Proceedings of the 23rd Annual Conference on Learning Theory, 2010.

Cited By

Neu G, Valko M (2014). Online combinatorial optimization with stochastic decision sets and adversarial losses. Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2, pages 2780-2788. DOI: 10.5555/2969033.2969137. Online publication date: 8-Dec-2014.
https://dl.acm.org/doi/10.5555/2969033.2969137
Abbasi-Yadkori Y, Bartlett P, Kanade V, Seldin Y, Szepesvári C (2013). Online learning in Markov decision processes with adversarially chosen transition probability distributions. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, pages 2508-2516. DOI: 10.5555/2999792.2999892. Online publication date: 5-Dec-2013.
https://dl.acm.org/doi/10.5555/2999792.2999892
Dams J, Hoefer M, Kesselheim T (2013). Sleeping Experts in Wireless Networks. Distributed Computing, pages 344-357. DOI: 10.1007/978-3-642-41527-2_24. Online publication date: 2013.
https://doi.org/10.1007/978-3-642-41527-2_24

Index Terms

Learning hurdles for sleeping experts
1. Theory of computation
  1. Computational complexity and cryptography
    1. Complexity classes

Published In

ITCS '12: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference

January 2012

516 pages

ISBN: 9781450311151

DOI: 10.1145/2090236

Program Chair:
Shafi Goldwasser
MIT

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Sponsors

SIGACT: ACM Special Interest Group on Algorithms and Computation Theory

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Division of Computing and Communication Foundations

Conference

ITCS '12

Sponsor:

SIGACT

ITCS '12: Innovations in Theoretical Computer Science

January 8 - 10, 2012

Cambridge, Massachusetts

Acceptance Rates

ITCS '12 Paper Acceptance Rate 39 of 93 submissions, 42%;

Overall Acceptance Rate 172 of 513 submissions, 34%

Article Metrics

Total Citations: 3
Total Downloads: 104
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 0

Reflects downloads up to 05 Mar 2025