Abstract
Many reinforcement learning (RL) environments consist of independent entities that interact sparsely. In such environments, RL agents have only limited influence over other entities in any particular situation. Our idea in this work is that learning can be efficiently guided by knowing when and what the agent can influence with its actions. To achieve this, we introduce a measure of situation-dependent causal influence based on conditional mutual information and show that it can reliably detect states of influence. We then propose several ways to integrate this measure into RL algorithms to improve exploration and off-policy learning. All modified algorithms show strong increases in data efficiency on robotic manipulation tasks.
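To make the idea of a situation-dependent influence measure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy discrete setting in which the transition probabilities of one entity are known exactly, and it computes the mutual information between the action and that entity's next state, conditioned on a single fixed state. The function name `state_conditional_mi` and the example numbers are hypothetical; the abstract does not specify how the measure is estimated in practice.

```python
import numpy as np


def state_conditional_mi(next_state_probs, action_probs):
    """Mutual information I(S'; A | S = s), in nats, for one fixed state s.

    next_state_probs: array of shape (num_actions, num_outcomes); row a holds
        P(s' | s, a) for the entity of interest (assumed known in this toy).
    action_probs: array of shape (num_actions,); the action distribution in s.
    """
    p = np.asarray(next_state_probs, dtype=float)
    pi = np.asarray(action_probs, dtype=float)

    # Marginal next-state distribution P(s' | s) after averaging over actions.
    marginal = pi @ p

    # Ratio P(s' | s, a) / P(s' | s); entries with zero probability are set
    # to 1 so that their log is 0 (the usual 0 * log 0 = 0 convention).
    valid = (p > 0) & (marginal > 0)
    ratio = np.divide(p, marginal, out=np.ones_like(p), where=valid)

    # KL divergence of each action-conditional distribution from the marginal.
    kl_per_action = (p * np.log(ratio)).sum(axis=1)

    # The conditional mutual information is the action-weighted average KL.
    return float(pi @ kl_per_action)


# Hypothetical state where the action does not change the entity's outcome.
no_influence = state_conditional_mi([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5])

# Hypothetical state where each action deterministically picks the outcome.
full_influence = state_conditional_mi([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
```

In this sketch the score is zero when the next-state distribution is the same under every action, and it reaches log 2 nats when two equally likely actions fully determine the outcome, which matches the intuition of detecting states in which the agent can influence an entity.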