research-article

Learning Hurdles for Sleeping Experts

Published: 01 July 2014

Abstract

We study the online decision problem in which the set of available actions varies over time, also called the sleeping experts problem. We consider the setting in which the performance comparison is made with respect to the best ordering of actions in hindsight. In this article, both the payoff function and the availability of actions are adversarial. Kleinberg et al. [2010] gave a computationally efficient no-regret algorithm in the setting in which payoffs are stochastic. Kanade et al. [2009] gave an efficient no-regret algorithm in the setting in which action availability is stochastic.
However, the question of whether there exists a computationally efficient no-regret algorithm in the adversarial setting was posed as an open problem by Kleinberg et al. [2010]. We show that such an algorithm would imply an algorithm for PAC learning DNF, a long-standing and important open problem. We also consider the setting in which the number of available actions is restricted and study its relation to agnostically learning monotone disjunctions over examples with bounded Hamming weight.
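
To make the comparison class concrete, the following is a minimal sketch of regret measured against the best ordering of actions in hindsight. It is an illustration under stated assumptions, not the construction from the article: the function names, the toy rounds, and the convention that an ordering plays its highest-ranked available action are all chosen here for exposition. The search over orderings is brute force and takes time exponential in the number of actions, so it says nothing about efficient algorithms.

```python
from itertools import permutations

def ordering_payoff(ordering, rounds):
    """Total payoff of a fixed ordering: in each round it plays the
    highest-ranked action that is available (awake) in that round.
    Assumes every round has at least one available action."""
    total = 0.0
    for available, payoffs in rounds:
        chosen = next(a for a in ordering if a in available)
        total += payoffs[chosen]
    return total

def best_ordering_in_hindsight(actions, rounds):
    """Brute-force search over all orderings (illustration only)."""
    return max(permutations(actions), key=lambda o: ordering_payoff(o, rounds))

def regret(learner_choices, actions, rounds):
    """Payoff of the best ordering in hindsight minus the learner's payoff."""
    learner_total = sum(
        payoffs[choice] for choice, (_, payoffs) in zip(learner_choices, rounds)
    )
    best = best_ordering_in_hindsight(actions, rounds)
    return ordering_payoff(best, rounds) - learner_total

# Hypothetical toy instance: each round is (available actions, payoff per action).
actions = ["a", "b", "c"]
rounds = [
    ({"a", "b"}, {"a": 1.0, "b": 0.0}),
    ({"b", "c"}, {"b": 0.0, "c": 1.0}),
    ({"a", "c"}, {"a": 0.0, "c": 1.0}),
]

best = best_ordering_in_hindsight(actions, rounds)
print(best, ordering_payoff(best, rounds))
print(regret(["b", "b", "a"], actions, rounds))
```

In this toy instance the ordering (c, a, b) collects payoff in every round, while a learner that plays b, b, a collects none. A no-regret algorithm would have to drive the average of this gap to zero against adversarial payoffs and availability, without enumerating orderings.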

References

[1]
J. Abernethy. 2010. Can we learn to gamble efficiently? (open problem). In Proceedings of the 23rd Annual Conference on Learning Theory. 318--319.
[2]
S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. 1998. Proof verification and the hardness of approximation problems. J. ACM 45, 501--555.
[3]
S. Ben-David, D. Pál, and S. Shalev-Shwartz. 2009. Agnostic online learning. In Proceedings of the 22nd Annual Conference on Learning Theory.
[4]
A. Beygelzimer, J. Langford, L. Li, L. Reyzin, and R. E. Schapire. 2011. Contextual bandit algorithms with supervised learning guarantees. In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR Proceedings Track, 19--26.
[5]
A. Blum and Y. Mansour. 2007. From external to internal regret. J. Machine Learn. Res. 8, 1307--1324.
[6]
N. Cesa-Bianchi, A. Conconi, and C. Gentile. 2004. On the generalization ability of on-line learning algorithms. IEEE Trans. Inform. Theory 50, 9, 2050--2057.
[7]
N. Cesa-Bianchi and G. Lugosi. 2006. Prediction, Learning, and Games. Cambridge University Press.
[8]
D. P. Dubhashi and A. Panconesi. 2009. Concentration of Measure for the Analysis of Randomized Algorithms. Cambridge University Press.
[9]
M. Dudik, D. Hsu, S. Kale, N. Karampatziakis, J. Langford, L. Reyzin, and T. Zhang. 2011. Efficient optimal learning for contextual bandits. In Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence. 169--178.
[10]
Y. Freund and R. E. Schapire. 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In Proceedings of the 2nd European Conference on Computational Learning Theory. 23--37.
[11]
Y. Freund, R. E. Schapire, Y. Singer, and M. K. Warmuth. 1997. Using and combining predictors that specialize. In Proceedings of the 29th Annual ACM Symposium on Theory of Computing. ACM, New York, NY, 334--343.
[12]
J. Håstad. 2001. Some optimal inapproximability results. J. ACM 48, 798--859.
[13]
D. Haussler. 1992. Decision theoretic generalizations of the PAC model for neural net and other learning applications. Inform. Computat. 100, 78--150.
[14]
E. Hazan, S. Kale, and S. Shalev-Shwartz. 2012. Near-optimal algorithms for online matrix prediction. In Proceedings of the 25th Annual Conference on Learning Theory, Vol. 23, JMLR Proceedings Track, 38.1--38.13.
[15]
A. T. Kalai, V. Kanade, and Y. Mansour. 2009. Reliable agnostic learning. In Proceedings of the 22nd Annual Conference on Learning Theory.
[16]
A. T. Kalai, A. R. Klivans, Y. Mansour, and R. A. Servedio. 2005. Agnostically learning halfspaces. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. IEEE.
[17]
V. Kanade, B. McMahan, and B. Bryan. 2009. Sleeping experts and bandits with stochastic action availability and adversarial rewards. In Proceedings of the 12th International Conference on Artificial Intelligence and Statistics. JMLR Proceedings Track, 272--279.
[18]
M. J. Kearns, R. E. Schapire, and L. M. Sellie. 1994. Toward efficient agnostic learning. Machine Learn. 17, 2--3, 115--141.
[19]
R. Kleinberg, A. Niculescu-Mizil, and Y. Sharma. 2010. Regret bounds for sleeping experts and bandits. Machine Learn. 80, 2--3, 245--272.
[20]
A. R. Klivans and R. A. Servedio. 2001. Learning DNF in time $2^{\tilde{O}(n^{1/3})}$. In Proceedings of the 33rd Annual ACM Symposium on Theory of Computing. ACM, New York, NY, 258--265.
[21]
A. R. Klivans and A. Sherstov. 2007. A lower bound for agnostically learning disjunctions. In Proceedings of the 20th Annual Conference on Learning Theory. 409--423.
[22]
J. Langford and T. Zhang. 2007. The epoch-greedy algorithm for contextual multi-armed bandits. In Proceedings of the 23rd Annual Conference on Neural Information Processing Systems.
[23]
N. Littlestone. 1989. From on-line to batch learning. In Proceedings of the 2nd Annual Workshop on Computational Learning Theory. 269--284.
[24]
C. H. Papadimitriou and M. Yannakakis. 1991. Optimization, approximation, and complexity classes. J. Comput. System Sci. 43, 3, 425--440.
[25]
S. Shalev-Shwartz, O. Shamir, and K. Sridharan. 2010. Learning kernel-based halfspaces with the zero-one loss. In Proceedings of the 23rd Annual Conference on Learning Theory. 441--450.
[26]
L. G. Valiant. 1984. A theory of the learnable. Commun. ACM 27, 11, 1134--1142.

Published In

ACM Transactions on Computation Theory  Volume 6, Issue 3
Special issue on innovations in theoretical computer science 2012 - Part II
July 2014
107 pages
ISSN:1942-3454
EISSN:1942-3462
DOI:10.1145/2663945
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 July 2014
Accepted: 01 July 2013
Revised: 01 June 2013
Received: 01 September 2012
Published in TOCT Volume 6, Issue 3

Author Tags

  1. Online learning
  2. lower bounds
  3. sleeping experts

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 19
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 05 Mar 2025

Citations

Cited By

  • (2024) On multi-armed bandit with impatient arms. Proceedings of the 41st International Conference on Machine Learning, 44429-44473. DOI: 10.5555/3692070.3693879. Online publication date: 21-Jul-2024
  • (2024) Decentralized Funding of Public Goods in Blockchain System: Leveraging Expert Advice. IEEE Transactions on Cloud Computing 12:2, 725-736. DOI: 10.1109/TCC.2024.3394973. Online publication date: Apr-2024
  • (2023) Adversarial Sleeping Bandit Problems with Multiple Plays: Algorithm and Ranking Application. Proceedings of the 17th ACM Conference on Recommender Systems, 744-749. DOI: 10.1145/3604915.3608824. Online publication date: 14-Sep-2023
  • (2022) Funding Public Goods with Expert Advice in Blockchain System. 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS), 180-190. DOI: 10.1109/ICDCS54860.2022.00026. Online publication date: Jul-2022
  • (2020) Combinatorial Sleeping Bandits With Fairness Constraints. IEEE Transactions on Network Science and Engineering 7:3, 1799-1813. DOI: 10.1109/TNSE.2019.2954310. Online publication date: 1-Jul-2020
  • (2019) Combinatorial Sleeping Bandits with Fairness Constraints. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, 1702-1710. DOI: 10.1109/INFOCOM.2019.8737461. Online publication date: Apr-2019
  • (2016) Hardness of online sleeping combinatorial optimization problems. Proceedings of the 30th International Conference on Neural Information Processing Systems, 2189-2197. DOI: 10.5555/3157096.3157341. Online publication date: 5-Dec-2016
