A Sample-Efficient Black-Box Optimizer to Train Policies for Human-in-the-Loop Systems With User Preferences
