Abstract:
In many real-world reinforcement learning (RL) tasks, the agent who takes the actions often has only partial observations of the environment. On the other hand, a principal may have a complete, system-level view but cannot directly take actions to interact with the environment. Motivated by this agent-principal capability mismatch, we study a novel “teaching” problem where the principal attempts to guide the agent's behavior via implicit adjustments to the agent's observed rewards. Rather than solving specific instances of this problem, we develop a general RL framework for the principal to teach any RL agent without knowing the optimal action a priori. The key idea is to view the agent as part of the environment, and to directly treat the reward adjustments as actions so that efficient learning and teaching can be simultaneously accomplished at the principal. This framework is fully adaptive to diverse principal and agent settings (such as heterogeneous agent strategies and adjustment costs), and can adopt a variety of RL algorithms to solve the teaching problem with provable performance guarantees. Extensive experimental results on different RL tasks demonstrate that the proposed framework guarantees stable convergence and achieves the best tradeoff between rewards and costs among various baseline solutions.
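To make the key idea concrete, below is a minimal illustrative sketch (not the authors' code) of how a principal can fold the learning agent into its own environment and choose reward adjustments as actions. All names here (TeachingEnv, QLearningAgent, the toy two-action task, and the cost weighting) are assumptions for illustration only.

```python
# Illustrative sketch of the abstract's key idea: the principal treats the
# learning agent as part of its environment and selects reward adjustments
# as its own actions. The task, agent, and reward/cost shaping below are
# hypothetical, not the paper's actual setup.
import numpy as np


class QLearningAgent:
    """A simple tabular learner that the principal wants to steer."""

    def __init__(self, n_actions, lr=0.1, eps=0.1):
        self.q = np.zeros(n_actions)
        self.lr, self.eps = lr, eps

    def act(self, rng):
        if rng.random() < self.eps:
            return int(rng.integers(len(self.q)))
        return int(np.argmax(self.q))

    def update(self, action, reward):
        # The agent only sees its (possibly adjusted) reward.
        self.q[action] += self.lr * (reward - self.q[action])


class TeachingEnv:
    """Principal-side environment in which the agent is part of the dynamics.

    The principal's action is a reward adjustment; its own reward trades off
    how closely the agent's behavior matches the principal's goal against the
    cost of the adjustment (both choices are illustrative).
    """

    def __init__(self, agent, target_action, cost_weight=0.1, seed=0):
        self.agent = agent
        self.target = target_action
        self.cost_weight = cost_weight
        self.rng = np.random.default_rng(seed)

    def step(self, adjustment):
        agent_action = self.agent.act(self.rng)
        base_reward = 1.0 if agent_action == 0 else 0.5  # agent's own task reward
        bonus = adjustment if agent_action == self.target else 0.0
        self.agent.update(agent_action, base_reward + bonus)
        # Principal reward: progress toward the target behavior minus adjustment cost.
        principal_reward = float(agent_action == self.target) - self.cost_weight * abs(adjustment)
        observation = self.agent.q.copy()  # the principal's system-level view
        return observation, principal_reward


if __name__ == "__main__":
    agent = QLearningAgent(n_actions=2)
    env = TeachingEnv(agent, target_action=1)
    for _ in range(500):
        # A fixed adjustment for illustration; an RL principal would learn this.
        obs, r = env.step(adjustment=1.0)
    print("agent Q-values after teaching:", obs)
```

In this sketch the principal never acts in the underlying task itself; it only perturbs the rewards the agent observes, which is the agent-principal capability mismatch the abstract describes.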
Date of Conference: 22-24 March 2023
Date Added to IEEE Xplore: 10 April 2023