Identifying and Addressing Delusions for Target-Directed Decision-Making

Abstract

We are interested in target-directed agents, which produce targets during decision-time planning to guide their behaviors and achieve better generalization during evaluation. Improper training of these agents can result in delusions: the agent may come to hold false beliefs about the targets, which cannot be properly rejected, leading to unwanted behaviors and damaging out-of-distribution generalization. We identify different types of delusions by using intuitive examples in carefully controlled environments, and investigate their causes. We demonstrate how delusions can be addressed for agents trained by hindsight relabeling, a mainstream approach for training target-directed RL agents. We validate empirically the effectiveness of the proposed solutions in correcting delusional behaviors and improving out-of-distribution generalization.
