
Paper

Authors: Piotr Januszewski, Dominik Grzegorzek and Paweł Czarnul

Affiliation: Department of Computer Architecture, Gdańsk University of Technology, Gdańsk, Poland

Keyword(s): Contextual Multi-Armed Bandits, Offline Policy Learning, Dataset Quality.

Abstract: The Contextual Multi-Armed Bandits (CMAB) framework is pivotal for learning to make decisions. However, due to challenges in deploying online algorithms, there is a shift towards offline policy learning, which relies on pre-existing datasets. This study examines the relationship between the quality of these datasets and the performance of offline policy learning algorithms, specifically, Neural Greedy and NeuraLCB. Our results demonstrate that NeuraLCB can learn from various datasets, while Neural Greedy necessitates extensive coverage of the action-space for effective learning. Moreover, the way data is collected significantly affects offline methods’ efficiency. This underscores the critical role of dataset quality in offline policy learning.

CC BY-NC-ND 4.0

Paper citation in several formats:
Januszewski, P.; Grzegorzek, D. and Czarnul, P. (2024). Dataset Characteristics and Their Impact on Offline Policy Learning of Contextual Multi-Armed Bandits. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-680-4; ISSN 2184-433X, SciTePress, pages 87-98. DOI: 10.5220/0012311000003636

@conference{icaart24,
author={Piotr Januszewski and Dominik Grzegorzek and Paweł Czarnul},
title={Dataset Characteristics and Their Impact on Offline Policy Learning of Contextual Multi-Armed Bandits},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2024},
pages={87-98},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012311000003636},
isbn={978-989-758-680-4},
issn={2184-433X},
}

TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Dataset Characteristics and Their Impact on Offline Policy Learning of Contextual Multi-Armed Bandits
SN - 978-989-758-680-4
IS - 2184-433X
AU - Januszewski, P.
AU - Grzegorzek, D.
AU - Czarnul, P.
PY - 2024
SP - 87
EP - 98
DO - 10.5220/0012311000003636
PB - SciTePress