Modular Networks Prevent Catastrophic Interference in Model-Based Multi-Task Reinforcement Learning

Abstract

In a multi-task reinforcement learning setting, the learner commonly benefits from training on multiple related tasks by exploiting similarities among them. At the same time, the trained agent is able to solve a wider range of different problems. While this effect is well documented for model-free multi-task methods, we demonstrate a detrimental effect when using a single learned dynamics model for multiple tasks. Thus, we address the fundamental question of whether model-based multi-task reinforcement learning benefits from shared dynamics models in the same way that model-free methods benefit from shared policy networks. Using a single dynamics model, we see clear evidence of task confusion and reduced performance. As a remedy, enforcing an internal structure for the learned dynamics model by training isolated sub-networks for each task notably improves performance while using the same number of parameters. We illustrate our findings by comparing both methods on a simple gridworld and a more complex ViZDoom multi-task experiment.
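
The idea of isolated per-task sub-networks can be illustrated with a minimal sketch. This is not the architecture used in the paper: the one-hidden-layer MLP, the class name `ModularDynamicsModel`, the helper functions, and all dimensions are assumptions chosen for illustration. The sketch only shows the structural point that each task's next-state prediction is routed through its own parameters, so changing one task's sub-network cannot alter predictions for another task.

```python
import math
import random


def init_mlp(rng, in_dim, hidden_dim, out_dim):
    """Create weights for a one-hidden-layer MLP as nested lists (illustrative only)."""
    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": matrix(hidden_dim, in_dim),
        "b1": [0.0] * hidden_dim,
        "W2": matrix(out_dim, hidden_dim),
        "b2": [0.0] * out_dim,
    }


def affine(W, b, x):
    """Compute W x + b for a row-major weight matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def mlp_forward(params, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = [math.tanh(v) for v in affine(params["W1"], params["b1"], x)]
    return affine(params["W2"], params["b2"], hidden)


def count_params(params):
    """Count scalar parameters in one MLP."""
    return (
        sum(len(row) for row in params["W1"]) + len(params["b1"])
        + sum(len(row) for row in params["W2"]) + len(params["b2"])
    )


class ModularDynamicsModel:
    """Hypothetical dynamics model with one isolated sub-network per task."""

    def __init__(self, num_tasks, state_dim, action_dim, hidden_per_task, seed=0):
        rng = random.Random(seed)
        # Each task owns its own weights; nothing is shared across tasks.
        self.subnets = [
            init_mlp(rng, state_dim + action_dim, hidden_per_task, state_dim)
            for _ in range(num_tasks)
        ]

    def predict(self, task_id, state, action):
        """Predict the next state using only the sub-network for task_id."""
        return mlp_forward(self.subnets[task_id], list(state) + list(action))

    def num_params(self):
        """Total parameter count across all sub-networks."""
        return sum(count_params(subnet) for subnet in self.subnets)
```

Under these assumptions, a single shared model of comparable size would use one MLP whose hidden width is roughly `num_tasks * hidden_per_task`, so that both variants spend a similar parameter budget. How the paper matches parameter counts exactly is not stated in the abstract.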
