Meta Reinforcement Learning with Successor Feature Based Context

Abstract

Most reinforcement learning (RL) methods focus only on learning a single task from scratch and cannot use prior knowledge to learn other tasks more effectively. Context-based meta-RL techniques have recently been proposed as a possible solution. However, they are usually less efficient than conventional RL and may require many trial-and-error interactions during training. To address this, we propose a novel meta-RL approach that achieves performance competitive with existing meta-RL algorithms while requiring significantly fewer environment interactions. By combining context variables with the idea of decomposing the reward in the successor feature framework, our method not only learns high-quality policies for multiple tasks simultaneously but can also quickly adapt to new tasks with a small amount of training. We empirically demonstrate the effectiveness and data efficiency of our method against state-of-the-art meta-RL baselines on several continuous control tasks.
