Training on more Reachable Tasks for Generalisation in Reinforcement Learning

  • 2024-10-04 17:15:31
  • Max Weltevrede, Caroline Horsch, Matthijs T. J. Spaan, Wendelin Böhmer

Abstract

In multi-task reinforcement learning, agents train on a fixed set of tasks and have to generalise to new ones. Recent work has shown that increased exploration improves this generalisation, but it remains unclear why exactly that is. In this paper, we introduce the concept of reachability in multi-task reinforcement learning and show that an initial exploration phase increases the number of reachable tasks the agent is trained on. This, and not the increased exploration, is responsible for the improved generalisation, even to unreachable tasks. Inspired by this, we propose a novel method Explore-Go that implements such an exploration phase at the beginning of each episode. Explore-Go only modifies the way experience is collected and can be used with most existing on-policy or off-policy reinforcement learning algorithms. We demonstrate the effectiveness of our method when combined with some popular algorithms and show an increase in generalisation performance across several environments.
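To make the idea of an exploration phase at the start of each episode concrete, here is a minimal sketch of what such a data-collection wrapper might look like. It is an illustration under stated assumptions, not the authors' implementation: the toy `ChainEnv`, the function `collect_episode`, the uniformly random exploration policy, the random phase length, and the choice to discard exploration-phase transitions are all hypothetical and not taken from the abstract.

```python
import random


class ChainEnv:
    """Hypothetical toy environment: a 1-D chain with states 0..n-1."""

    def __init__(self, n=10, horizon=20):
        self.n, self.horizon = n, horizon

    def reset(self):
        self.state, self.t = 0, 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the state is clipped to the chain.
        self.state = min(max(self.state + action, 0), self.n - 1)
        self.t += 1
        reward = 1.0 if self.state == self.n - 1 else 0.0
        done = reward > 0 or self.t >= self.horizon
        return self.state, reward, done


def collect_episode(env, policy, max_explore_steps, rng):
    """Collect one episode with an initial pure-exploration phase.

    Assumption: the phase length is drawn uniformly from
    [0, max_explore_steps], exploration actions are uniformly random,
    and exploration transitions are NOT handed to the learner.
    """
    state = env.reset()
    done = False

    # Exploration phase: move to some state reachable from the start state.
    for _ in range(rng.randint(0, max_explore_steps)):
        state, _, done = env.step(rng.choice([-1, 1]))
        if done:
            break

    # Training phase: the agent's own policy acts from the reached state,
    # and only these transitions are returned for learning.
    start_state = state
    transitions = []
    while not done:
        action = policy(state)
        next_state, reward, done = env.step(action)
        transitions.append((state, action, reward, next_state, done))
        state = next_state
    return start_state, transitions


if __name__ == "__main__":
    rng = random.Random(0)
    env = ChainEnv()
    always_right = lambda s: 1  # stand-in for a learned policy
    starts = {collect_episode(env, always_right, 5, rng)[0] for _ in range(200)}
    print("distinct effective start states:", sorted(starts))
```

Because the sketch only changes how transitions are gathered, the returned list could in principle be fed to either a replay buffer (off-policy) or a rollout batch (on-policy), which matches the abstract's claim that only experience collection is modified.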

 
