CURIOUS: Intrinsically Motivated Multi-Task, Multi-Goal Reinforcement Learning

Abstract

In open-ended and changing environments, agents face a wide range of potential tasks that may or may not come with associated reward functions. Such autonomous learning agents must be able to generate their own tasks through a process of intrinsically motivated exploration, some of which might prove easy, others impossible. For this reason, they should be able to actively select which task to practice at any given moment, to maximize their overall mastery on the set of learnable tasks. This paper proposes CURIOUS, an extension of Universal Value Function Approximators that enables intrinsically motivated agents to learn to achieve both multiple tasks and multiple goals within a unique policy, leveraging hindsight learning. Agents focus on achievable tasks first, using an automated curriculum learning mechanism that biases their attention towards tasks maximizing the absolute learning progress. This mechanism provides robustness to catastrophic forgetting (by refocusing on tasks where performance decreases) and distracting tasks (by avoiding tasks with no absolute learning progress). Furthermore, we show that having two levels of parameterization (tasks and goals within tasks) enables more efficient learning of skills in an environment with a modular physical structure (e.g. multiple objects) as compared to flat, goal-parameterized RL with hindsight experience replay.
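The curriculum mechanism described above can be illustrated with a minimal sketch. The abstract only states that attention is biased towards tasks with high absolute learning progress; everything else here is an assumption made for illustration: measuring progress as the difference in mean success rate between the older and more recent halves of a task's outcome history, mixing in uniform sampling through a hypothetical `epsilon` parameter, and the function names themselves.

```python
import random


def absolute_learning_progress(outcomes):
    """Assumed measure: |recent success rate - older success rate|."""
    history = list(outcomes)
    half = len(history) // 2
    if half == 0:
        return 0.0  # not enough data to estimate progress
    older, recent = history[:half], history[-half:]
    return abs(sum(recent) / half - sum(older) / half)


def task_probabilities(histories, epsilon=0.2):
    """Mix uniform sampling with sampling proportional to absolute progress.

    `epsilon` (hypothetical) keeps a small chance of practicing every task,
    so that progress estimates for neglected tasks can be refreshed.
    """
    progress = [absolute_learning_progress(h) for h in histories]
    total = sum(progress)
    n = len(histories)
    if total == 0.0:
        return [1.0 / n] * n  # no signal yet: fall back to uniform
    return [epsilon / n + (1.0 - epsilon) * p / total for p in progress]


def select_task(histories, epsilon=0.2, rng=random):
    """Sample the index of the next task to practice."""
    probs = task_probabilities(histories, epsilon)
    return rng.choices(range(len(histories)), weights=probs, k=1)[0]
```

Under these assumptions, a task whose success rate is rising and a task whose success rate is falling (forgetting) both receive high probability, because the progress is taken in absolute value, while an impossible task and an already mastered task, whose success rates are flat, are sampled only through the uniform term.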
