Abstract:
A class of asymptotically ε-optimal two-armed bandit controllers is given, and two criteria for comparing the long-term finite-time performance of controllers in this class are proposed. The performances of three particular controllers are compared using the criteria, and the analysis is confirmed by computer iteration of the appropriate probability recurrence relations.
Published in: IEEE Transactions on Systems, Man, and Cybernetics (Volume: SMC-3, Issue: 2, March 1973)