
Risk-averse trees for learning from logged bandit feedback



Abstract:

Logged data is one of the most widespread forms of recorded information, since it can be acquired by almost any system and stored at little cost. Customarily, the interaction logs between a system and a user (or environment) have the structure of a sequential decision process: given a context, the system performs an action and the user provides feedback on it. This structure is common to a wide range of real-world micro-economic applications, e.g., e-commerce websites and advertisement campaigns. The problem of learning a policy from such logged interactions in order to take more profitable decisions in the future is known as the Learning from Logged Bandit Feedback (LLBF) problem. In this paper, we propose RADT, an algorithm specifically designed for the LLBF setting and based on a risk-averse learning method that exploits the joint use of regression trees and statistical confidence bounds. Unlike existing techniques developed for this setting, RADT generates policies that aim to maximize a lower bound on the expected reward and provides a clear characterization of the features in the context that influence the process the most. Finally, we present an extensive experimental campaign over both synthetic and real-world datasets, showing empirical evidence that RADT outperforms both state-of-the-art machine learning classification and regression techniques and existing methods addressing the LLBF setting.
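
The abstract does not detail RADT's construction, so the sketch below only illustrates the general risk-averse principle it names: partition contexts with a regression tree and act on a statistical lower confidence bound of the estimated reward rather than on the point estimate. Everything in it is an assumption for illustration, not the paper's method: one tree per action, a Hoeffding-style bound, rewards in [0, 1], uniform logging with no propensity re-weighting, and hypothetical class and method names.

# Illustrative sketch only (not the authors' RADT): a risk-averse policy
# learned from logged bandit feedback (context, action, reward).
# Assumptions: one regression tree per action, Hoeffding-style lower
# confidence bound per leaf, rewards in [0, 1], uniform logging policy.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class LowerBoundTreePolicy:
    def __init__(self, n_actions, delta=0.05, max_depth=4):
        self.n_actions = n_actions
        self.delta = delta          # confidence level for the bound
        self.max_depth = max_depth
        self.trees = []             # one regression tree per action
        self.leaf_stats = []        # per tree: {leaf_id: (mean reward, count)}

    def fit(self, X, actions, rewards):
        for a in range(self.n_actions):
            mask = actions == a
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X[mask], rewards[mask])
            # Record empirical mean reward and sample count in each leaf.
            leaves = tree.apply(X[mask])
            stats = {}
            for leaf in np.unique(leaves):
                r = rewards[mask][leaves == leaf]
                stats[leaf] = (r.mean(), len(r))
            self.trees.append(tree)
            self.leaf_stats.append(stats)
        return self

    def _lower_bound(self, mean, count):
        # Hoeffding-style lower confidence bound on the expected reward.
        return mean - np.sqrt(np.log(1.0 / self.delta) / (2.0 * count))

    def predict(self, X):
        # For each context, choose the action whose leaf-level lower bound
        # on expected reward is largest (risk-averse choice).
        bounds = np.empty((len(X), self.n_actions))
        for a, (tree, stats) in enumerate(zip(self.trees, self.leaf_stats)):
            leaves = tree.apply(X)
            bounds[:, a] = [self._lower_bound(*stats[l]) for l in leaves]
        return bounds.argmax(axis=1)

In this sketch the risk aversion enters through the count-dependent penalty: actions whose estimated reward rests on few logged samples are discounted, so the resulting policy prefers well-supported choices over optimistic but poorly observed ones.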
Date of Conference: 14-19 May 2017
Date Added to IEEE Xplore: 03 July 2017
Electronic ISSN: 2161-4407
Conference Location: Anchorage, AK, USA
