
A conjecture on the Feldman bandit problem

Published online by Cambridge University Press:  28 March 2018

Maher Nouiehed*
Affiliation:
University of Southern California
Sheldon M. Ross*
Affiliation:
University of Southern California
* Postal address: Department of Industrial and Systems Engineering, University of Southern California, Los Angeles, CA 90089, USA.

Abstract

We consider the Bernoulli bandit problem where one of the arms has win probability α and the others have win probability β, with the identity of the α arm specified by initial probabilities. With u = max(α, β) and v = min(α, β), call an arm with win probability u a good arm. Whereas it is known that the strategy of always playing the arm with the largest probability of being a good arm maximizes the expected number of wins in the first n games for all n, we conjecture that it also stochastically maximizes the number of wins. That is, we conjecture that this strategy maximizes the probability of at least k wins in the first n games for all k and n. The conjecture is proven for k = 1 and k = n, and, when there are only two arms, for k = n - 1.
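
To make the strategy described above concrete, the following is a minimal simulation sketch in Python; it is not taken from the paper, the function name simulate, its arguments, and the seeding details are illustrative assumptions, and it presumes 0 < α, β < 1.

import random

def simulate(alpha, beta, priors, n, seed=None):
    """Play n games with the strategy that always selects the arm having the
    largest probability of being a good arm (win probability max(alpha, beta)).

    priors[i] is the initial probability that arm i is the alpha arm.
    Returns the number of wins in the first n games.
    """
    rng = random.Random(seed)
    m = len(priors)
    # Draw the true identity of the alpha arm from the prior.
    alpha_arm = rng.choices(range(m), weights=priors)[0]

    p = list(priors)  # posterior probability that arm i is the alpha arm
    wins = 0
    for _ in range(n):
        # If alpha >= beta the alpha arm is the good arm, so play the arm most
        # likely to be it; otherwise every beta arm is good, so play the arm
        # least likely to be the alpha arm.
        if alpha >= beta:
            j = max(range(m), key=lambda i: p[i])
        else:
            j = min(range(m), key=lambda i: p[i])

        win = rng.random() < (alpha if j == alpha_arm else beta)
        wins += win

        # Bayes update of the posterior given the outcome observed on arm j.
        likelihood = [
            (alpha if i == j else beta) if win
            else ((1 - alpha) if i == j else (1 - beta))
            for i in range(m)
        ]
        posterior = [p[i] * likelihood[i] for i in range(m)]
        total = sum(posterior)  # positive whenever 0 < alpha, beta < 1
        p = [x / total for x in posterior]
    return wins

For example, simulate(0.7, 0.4, [0.2, 0.3, 0.5], 100) gives the win count for one run with three hypothetical arms; averaging such runs over many seeds approximates the distribution of the number of wins whose stochastic maximality is conjectured.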

Type
Short Communications
Copyright
Copyright © Applied Probability Trust 2018 


References

[1] Feldman, D. (1962). Contributions to the "two-armed bandit" problem. Ann. Math. Statist. 33, 847–856.
[2] Rodman, L. (1978). On the many-armed bandit problem. Ann. Prob. 6, 491–498.
[3] Presman, É. L. and Sonin, I. N. (1990). Sequential Control with Incomplete Information: The Bayesian Approach to Multi-Armed Bandit Problems. Academic Press, San Diego, CA.