Concurrent Meta Reinforcement Learning

  • 2019-03-07 03:28:41
  • Emilio Parisotto, Soham Ghosh, Sai Bhargav Yalamanchi, Varsha Chinnaobireddy, Yuhuai Wu, Ruslan Salakhutdinov


State-of-the-art meta reinforcement learning algorithms typically assume the setting of a single agent interacting with its environment in a sequential manner. A negative side-effect of this sequential execution paradigm is that, as the environment becomes more challenging and thus requires more interaction episodes for the meta-learner, the agent must reason over longer and longer time-scales. To combat the difficulty of long time-scale credit assignment, we propose an alternative parallel framework, which we name "Concurrent Meta-Reinforcement Learning" (CMRL), that transforms the temporal credit assignment problem into a multi-agent reinforcement learning one. In this multi-agent setting, a set of parallel agents is executed in the same environment and each of these "rollout" agents is given the means to communicate with the others. The goal of the communication is to coordinate, in a collaborative manner, the most efficient exploration of the shared task the agents are currently assigned. This coordination therefore represents the meta-learning aspect of the framework, as each agent can be assigned, or assign itself, a particular section of the current task's state space. This framework is in contrast to standard RL methods that assume that each parallel rollout occurs independently, which can potentially waste computation if many of the rollouts end up sampling the same part of the state space. Furthermore, the parallel setting enables us to define several reward sharing functions and auxiliary losses that are non-trivial to apply in the sequential setting. We demonstrate the effectiveness of our proposed CMRL at improving over sequential methods in a variety of challenging tasks.
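
The claim that independent parallel rollouts can waste computation by sampling the same part of the state space can be illustrated with a toy sketch. The code below is not the CMRL algorithm: it replaces learned communication with a fixed, hand-written partition of a one-dimensional state space, and all names (`independent_coverage`, `coordinated_coverage`, `N_STATES`, `N_AGENTS`, `BUDGET`) are hypothetical. It only compares how many distinct states are visited when agents sample states independently versus when each agent is assigned its own section.

```python
import random


def independent_coverage(n_states, n_agents, budget, seed=0):
    """Each agent samples states uniformly at random, ignoring the others.

    Returns the number of distinct states visited by the whole group.
    """
    rng = random.Random(seed)
    visited = set()
    for _ in range(n_agents):
        for _ in range(budget):
            visited.add(rng.randrange(n_states))
    return len(visited)


def coordinated_coverage(n_states, n_agents, budget):
    """Each agent sweeps only its own contiguous section of the state space.

    Assumption for this sketch: n_agents divides n_states evenly, and the
    assignment is fixed in advance rather than negotiated by communication.
    """
    section = n_states // n_agents
    visited = set()
    for agent in range(n_agents):
        start = agent * section
        for step in range(budget):
            # Wrap around inside the agent's own section.
            visited.add(start + step % section)
    return len(visited)


N_STATES, N_AGENTS, BUDGET = 100, 4, 25

indep = independent_coverage(N_STATES, N_AGENTS, BUDGET)
coord = coordinated_coverage(N_STATES, N_AGENTS, BUDGET)
print(f"independent rollouts visited {indep} of {N_STATES} states")
print(f"coordinated rollouts visited {coord} of {N_STATES} states")
```

With the same total interaction budget, the partitioned agents cover every state, while the independent agents revisit states that other agents have already seen. In CMRL the assignment would instead emerge from communication between the agents, which this sketch does not model.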

