Distributional Reward Decomposition for Reinforcement Learning

Abstract

Many reinforcement learning (RL) tasks have specific properties that can be leveraged to modify existing RL algorithms to adapt to those tasks and further improve performance, and a general class of such properties is the multiple reward channel. In those environments the full reward can be decomposed into sub-rewards obtained from different channels. Existing work on reward decomposition either requires prior knowledge of the environment to decompose the full reward, or decomposes reward without prior knowledge but with degraded performance. In this paper, we propose Distributional Reward Decomposition for Reinforcement Learning (DRDRL), a novel reward decomposition algorithm which captures the multiple reward channel structure under the distributional setting. Empirically, our method captures the multi-channel structure and discovers meaningful reward decomposition, without any requirements on prior knowledge. Consequently, our agent achieves better performance than existing methods on environments with multiple reward channels.
