Abstract:
In this work, we study the no-sensing adversarial multi-player multi-armed bandits problem. A new dimension of hardness, called attackability, is introduced, which is ort...Show MoreMetadata
Abstract:
In this work, we study the no-sensing adversarial multi-player multi-armed bandits problem. A new dimension of hardness, called attackability, is introduced, which is orthogonal to the hardness of multiple players. All adversaries can be categorized based on the attackability and we introduce Adversary-Adaptive Collision-Communication (A2C2), a family of algorithms with forced-collision communications among players. Information-theoretic tools of the Z-channel model, error-correction/detection coding, and randomized communication are utilized to address the challenge of implicit communication without collision information in an adversarial environment. Theoretical analysis proves that asymptotic attackability-dependent sublinear regrets can be achieved, which do not have an exponential dependence on the number of players and as a result reveal a fundamental tradeoff between the two dimensions of hardness in this problem.
Date of Conference: 12-20 July 2021
Date Added to IEEE Xplore: 01 September 2021
ISBN Information: