Information Directed Reward Learning for Reinforcement Learning

Abstract

For many reinforcement learning (RL) applications, specifying a reward is difficult. In this paper, we consider an RL setting where the agent can obtain information about the reward only by querying an expert that can, for example, evaluate individual states or provide binary preferences over trajectories. From such expensive feedback, we aim to learn a model of the reward function that allows standard RL algorithms to achieve high expected return with as few expert queries as possible. For this purpose, we propose Information Directed Reward Learning (IDRL), which uses a Bayesian model of the reward function and selects queries that maximize the information gain about the difference in return between potentially optimal policies. In contrast to prior active reward learning methods designed for specific types of queries, IDRL naturally accommodates different query types. Moreover, by shifting the focus from reducing the reward approximation error to improving the policy induced by the reward model, it achieves similar or better performance with significantly fewer queries. We support our findings with extensive evaluations in multiple environments and with different types of queries.
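
To make the query selection criterion concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: the reward is linear in state features with a Gaussian posterior over the weights, a query observes the reward of a single state with Gaussian noise, and only one fixed pair of candidate policies is compared. The names `information_gain`, `select_query`, `cov`, `d`, `phi`, and `noise_var` are hypothetical. In this setting the return difference between the two policies is jointly Gaussian with the query outcome, so the information gain has a closed form.

```python
import numpy as np

def information_gain(cov, d, phi, noise_var):
    """Information gain (in nats) about G = d @ w from one noisy observation.

    Assumptions (illustrative only): w ~ N(mu, cov) is the posterior over
    linear reward weights, d is the difference in expected feature counts of
    two candidate policies, and the query returns y = phi @ w + noise with
    noise ~ N(0, noise_var).
    """
    var_g = d @ cov @ d                    # prior variance of the return difference
    cov_gy = d @ cov @ phi                 # covariance between G and the observation y
    var_y = phi @ cov @ phi + noise_var    # variance of the observation
    var_g_post = var_g - cov_gy ** 2 / var_y  # Gaussian conditioning
    return 0.5 * np.log(var_g / var_g_post)

def select_query(cov, d, candidates, noise_var):
    """Return the index of the candidate query with the largest information gain."""
    gains = [information_gain(cov, d, phi, noise_var) for phi in candidates]
    return int(np.argmax(gains)), gains

# Toy example: three features, an uninformative unit-variance posterior.
cov = np.eye(3)
d = np.array([1.0, -1.0, 0.0])             # the two policies differ in features 0 and 1
candidates = np.array([
    [0.0, 0.0, 1.0],                       # state informative only about feature 2
    [1.0, 0.0, 0.0],                       # state informative about feature 0
    [1.0, -1.0, 0.0],                      # state aligned with the return difference
])
best, gains = select_query(cov, d, candidates, noise_var=0.1)
```

In this toy example the first candidate has zero gain because it says nothing about the features in which the two policies differ, even though it would reduce the overall reward uncertainty. This illustrates the shift in focus described above, from reward approximation error to the quantity that determines which policy is better.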
