Learning Hurdles for Sleeping Experts

Authors:

Thomas Steinke

ACM Transactions on Computation Theory (TOCT), Volume 6, Issue 3

Article No.: 11, Pages 1 - 16

https://doi.org/10.1145/2505983

Published: 01 July 2014

Abstract

We study the online decision problem in which the set of available actions varies over time, also called the sleeping experts problem. We consider the setting in which the performance comparison is made with respect to the best ordering of actions in hindsight. In this article, both the payoff function and the availability of actions are adversarial. Kleinberg et al. [2010] gave a computationally efficient no-regret algorithm in the setting in which payoffs are stochastic. Kanade et al. [2009] gave an efficient no-regret algorithm in the setting in which action availability is stochastic.

However, the question of whether there exists a computationally efficient no-regret algorithm in the adversarial setting was posed as an open problem by Kleinberg et al. [2010]. We show that such an algorithm would imply an algorithm for PAC learning DNF, a long-standing and important open problem. We also consider the setting in which the number of available actions is restricted and study its relation to agnostically learning monotone disjunctions over examples with bounded Hamming weight.
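To make the benchmark concrete, the following is a minimal sketch of regret measured against the best ordering of actions in hindsight. It is an illustration only, not the construction from the article: the function names, the representation of a round as an `(available, payoffs)` pair, and the toy data are all assumptions, and every round is assumed to have at least one available action. An ordering plays its highest-ranked available action in each round; the brute-force search over all n! orderings shows why finding the comparator naively is expensive.

```python
from itertools import permutations


def ordering_payoff(ordering, rounds):
    """Total payoff of always playing the highest-ranked available action."""
    total = 0.0
    for available, payoffs in rounds:
        # First action in the ordering that is awake this round
        # (assumes at least one action is available in every round).
        choice = next(a for a in ordering if a in available)
        total += payoffs[choice]
    return total


def best_ordering(actions, rounds):
    """Brute-force search over all n! orderings; exponential in n."""
    return max(permutations(actions), key=lambda o: ordering_payoff(o, rounds))


def regret(played, rounds, actions):
    """Payoff of the best ordering in hindsight minus the payoff earned."""
    earned = sum(payoffs[a] for a, (_, payoffs) in zip(played, rounds))
    return ordering_payoff(best_ordering(actions, rounds), rounds) - earned


# Hypothetical toy instance: three actions, three rounds.
actions = ["a", "b", "c"]
rounds = [
    ({"a", "b"}, {"a": 1.0, "b": 0.0}),
    ({"b", "c"}, {"b": 0.0, "c": 1.0}),
    ({"a", "c"}, {"a": 0.0, "c": 1.0}),
]
played = ["a", "b", "c"]

print(best_ordering(actions, rounds))
print(regret(played, rounds, actions))
```

A no-regret algorithm in this sense is one whose regret grows sublinearly in the number of rounds. The sketch only evaluates the benchmark after the fact; it is not an online algorithm.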

References

[1]

J. Abernethy. 2010. Can we learn to gamble efficiently? (open problem). In Proceedings of the 23rd Annual Conference on Learning Theory. 318--319.

[2]

S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. 1998. Proof verification and the hardness of approximation problems. J. ACM 45, 501--555.

[3]

S. Ben-David, D. Pál, and S. Shalev-Shwartz. 2009. Agnostic online learning. In Proceedings of the 22nd Annual Conference on Learning Theory.

[4]

A. Beygelzimer, J. Langford, L. Li, L. Reyzin, and R. E. Schapire. 2011. Contextual bandit algorithms with supervised learning guarantees. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR Proceedings Track, 19--26.

[5]

A. Blum and Y. Mansour. 2007. From external to internal regret. J. Machine Learn. Res. 8, 1307--1324.

[6]

N. Cesa-Bianchi, A. Conconi, and C. Gentile. 2004. On the generalization ability of on-line learning algorithms. IEEE Trans. Inform. Theory 50, 9, 2050--2057.

[7]

N. Cesa-Bianchi and G. Lugosi. 2006. Prediction, Learning, and Games. Cambridge University Press.

[8]

D. P. Dubhashi and A. Panconesi. 2009. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press.

[9]

M. Dudik, D. Hsu, S. Kale, N. Karampatziakis, J. Langford, L. Reyzin, and T. Zhang. 2011. Efficient optimal learning for contextual bandits. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence. 169--178.

[10]

Y. Freund and R. E. Schapire. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the 2nd European Conference on Computational Learning Theory. 23--37.

[11]

Y. Freund, R. E. Schapire, Y. Singer, and M. K. Warmuth. 1997. Using and combining predictors that specialize. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing. ACM, New York, NY, 334--343.

[12]

J. Håstad. 2001. Some optimal inapproximability results. J. ACM 48, 798--859.

[13]

D. Haussler. 1992. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Inform. Computat. 100, 78--150.

[14]

E. Hazan, S. Kale, and S. Shalev-Shwartz. 2012. Near-optimal algorithms for online matrix prediction. In Proceedings of the 25th Annual Conference on Learning Theory, Vol. 23, JMLR Proceedings Track, 38.1--38.13.

[15]

A. T. Kalai, V. Kanade, and Y. Mansour. 2009. Reliable agnostic learning. In Proceedings of the 22nd Annual Conference on Learning Theory.

[16]

A. T. Kalai, A. R. Klivans, Y. Mansour, and R. A. Servedio. 2005. Agnostically learning halfspaces. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. IEEE.

[17]

V. Kanade, B. McMahan, and B. Bryan. 2009. Sleeping experts and bandits with stochastic action availability and adversarial rewards. In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics. JMLR Proceedings Track, 272--279.

[18]

M. J. Kearns, R. E. Schapire, and L. M. Sellie. 1994. Toward efficient agnostic learning. Machine Learn. 17, 2--3, 115--141.

[19]

R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. 2010. Regret bounds for sleeping experts and bandits. Machine Learn. 80, 2--3, 245--272.

[20]

A. R. Klivans and R. A. Servedio. 2001. Learning DNF in time $2^{\tilde{O}(n^{1/3})}$. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing. ACM, New York, NY, 258--265.

[21]

A. R. Klivans and A. Sherstov. 2007. A lower bound for agnostically learning disjunctions. In Proceedings of the 20th Annual Conference on Learning Theory. 409--423.

[22]

J. Langford and T. Zhang. 2007. The epoch-greedy algorithm for contextual multi-armed bandits. In Proceedings of the 23rd Annual Conference on Neural Information Processing Systems.

[23]

N. Littlestone. 1989. From on-line to batch learning. In Proceedings of the 2nd Annual Workshop on Computational Learning Theory. 269--284.

[24]

C. H. Papadimitriou and M. Yannakakis. 1991. Optimization, approximation, and complexity classes. J. Comput. System Sci. 43, 3, 425--440.

[25]

S. Shalev-Shwartz, O. Shamir, and K. Sridharan. 2010. Learning kernel-based halfspaces with the zero-one loss. In Proceedings of the 23rd Annual Conference on Learning Theory. 441--450.

[26]

L. G. Valiant. 1984. A theory of the learnable. Commun. ACM 27, 11, 1134--1142.

Index Terms

Learning Hurdles for Sleeping Experts
1. Computing methodologies
  1. Machine learning
2. Theory of computation
  1. Computational complexity and cryptography
    1. Complexity classes

Published In

ACM Transactions on Computation Theory Volume 6, Issue 3

Special issue on innovations in theoretical computer science 2012 - Part II

July 2014

107 pages

ISSN:1942-3454

EISSN:1942-3462

DOI:10.1145/2663945

Editor:
Eric Allender
Rutgers University, NJ, USA

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 July 2014

Accepted: 01 July 2013

Revised: 01 June 2013

Received: 01 September 2012

Published in TOCT Volume 6, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

Lord Rutherford Memorial Research Fellowship
Division of Computing and Communication Foundations

Cited By

Shao Y, Fang Z, Salakhutdinov R, Kolter Z, Heller K, Weller A, Oliver N, Scarlett J, Berkenkamp F. (2024). On multi-armed bandit with impatient arms. Proceedings of the 41st International Conference on Machine Learning, 44429-44473. Online publication date: 21-Jul-2024.
https://dl.acm.org/doi/10.5555/3692070.3693879
Li J, Cheng Y, Huang W, Zhang M, Fan J, Deng X, Xie J, Zhang J. (2024). Decentralized Funding of Public Goods in Blockchain System: Leveraging Expert Advice. IEEE Transactions on Cloud Computing 12:2, 725-736. Online publication date: Apr-2024.
https://doi.org/10.1109/TCC.2024.3394973
Yuan J, Woon W, Coba L. (2023). Adversarial Sleeping Bandit Problems with Multiple Plays: Algorithm and Ranking Application. Proceedings of the 17th ACM Conference on Recommender Systems, 744-749. Online publication date: 14-Sep-2023.
https://dl.acm.org/doi/10.1145/3604915.3608824
Li J, Cheng Y, Huang W, Zhang M, Fan J, Deng X, Xie J. (2022). Funding Public Goods with Expert Advice in Blockchain System. 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS), 180-190. Online publication date: Jul-2022.
https://doi.org/10.1109/ICDCS54860.2022.00026
Li F, Liu J, Ji B. (2020). Combinatorial Sleeping Bandits With Fairness Constraints. IEEE Transactions on Network Science and Engineering 7:3, 1799-1813. Online publication date: 1-Jul-2020.
https://doi.org/10.1109/TNSE.2019.2954310
Li F, Liu J, Ji B. (2019). Combinatorial Sleeping Bandits with Fairness Constraints. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, 1702-1710. Online publication date: Apr-2019.
https://doi.org/10.1109/INFOCOM.2019.8737461
Kale S, Lee C, Pál D. (2016). Hardness of online sleeping combinatorial optimization problems. Proceedings of the 30th International Conference on Neural Information Processing Systems, 2189-2197. Online publication date: 5-Dec-2016.
https://dl.acm.org/doi/10.5555/3157096.3157341
